Clean Slang in Texts

Description

clean_slang is a pretrained pipeline that normalizes slang in English text. It assembles the input into a document, tokenizes it, and normalizes the tokens, replacing slang terms with their standard equivalents (for example, "yo" becomes "hey" and "ya" becomes "you").


How to use

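from sparknlp.pretrained import PretrainedPipeline

pipeline = PretrainedPipeline('clean_slang', lang='en')

testDoc = '''
yo, what is wrong with ya?
'''

# Annotate the sample text; fullAnnotate returns one result per input document.
result = pipeline.fullAnnotate(testDoc)[0]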

import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline

val pipeline = new PretrainedPipeline("clean_slang", lang = "en")
val result = pipeline.fullAnnotate("yo, what is wrong with ya?")

import nlu
text = ["yo, what is wrong with ya?"]
result_df = nlu.load('en.clean.slang').predict(text)
result_df

Results

['hey', 'what', 'is', 'wrong', 'with', 'you']

Model Information

Model Name:	clean_slang
Type:	pipeline
Compatibility:	Spark NLP 3.3.4+
License:	Open Source
Edition:	Official
Language:	en
Size:	19.1 KB

Included Models

DocumentAssembler
TokenizerModel
NormalizerModel
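
To illustrate how these three stages fit together, here is a minimal plain-Python sketch of the same flow: wrap the text as a document, split it into tokens, then clean the tokens and map slang to standard words. It does not use Spark NLP and is not the pipeline's actual implementation. The function names, the lowercasing step, and the two-entry SLANG dictionary are assumptions made for illustration; the dictionary shipped with clean_slang is not listed on this page.

```python
import re

# Hypothetical slang dictionary (assumption): the real entries used by
# clean_slang are not documented on this page.
SLANG = {"yo": "hey", "ya": "you"}


def assemble(text):
    """Stand-in for DocumentAssembler: wrap raw text as a document."""
    return {"document": text.strip()}


def tokenize(doc):
    """Stand-in for TokenizerModel: split into word and punctuation tokens."""
    return re.findall(r"\w+|[^\w\s]", doc["document"])


def normalize(tokens, slang=SLANG):
    """Stand-in for NormalizerModel: strip non-letters, lowercase, map slang."""
    cleaned = []
    for tok in tokens:
        word = re.sub(r"[^A-Za-z]", "", tok).lower()
        if word:  # tokens that were pure punctuation become empty and are dropped
            cleaned.append(slang.get(word, word))
    return cleaned


def clean_slang_sketch(text):
    """Run the three stand-in stages in order."""
    return normalize(tokenize(assemble(text)))


print(clean_slang_sketch("yo, what is wrong with ya?"))
# ['hey', 'what', 'is', 'wrong', 'with', 'you']
```

With the sample sentence from the usage section, the sketch yields the same token list shown under Results, which suggests the slang mapping happens in the normalization stage after tokenization.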
