Description
Classifies tweets as racism, sexism, or neutral.
Predicted Entities
neutral, racism, sexism
How to use
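import sparknlp
from sparknlp.base import DocumentAssembler, LightPipeline
from sparknlp.annotator import UniversalSentenceEncoder, ClassifierDLModel
from pyspark.ml import Pipeline
# Start (or reuse) a Spark session with Spark NLP on the classpath
spark = sparknlp.start()
# Wrap the raw text column into Spark NLP documents
document_assembler = DocumentAssembler()\
    .setInputCol("text")\
    .setOutputCol("document")
# Sentence embeddings from the Universal Sentence Encoder (model dependency: tfhub_use)
use = UniversalSentenceEncoder.pretrained('tfhub_use', lang="en") \
    .setInputCols(["document"])\
    .setOutputCol("sentence_embeddings")
# Pretrained classifier that labels each document as neutral, racism, or sexism
document_classifier = ClassifierDLModel.pretrained('classifierdl_use_cyberbullying', 'en') \
    .setInputCols(["document", "sentence_embeddings"]) \
    .setOutputCol("class")
nlp_pipeline = Pipeline(stages=[document_assembler, use, document_classifier])
# Fit on an empty DataFrame: every stage is pretrained, so no training data is needed
light_pipeline = LightPipeline(nlp_pipeline.fit(spark.createDataFrame([['']]).toDF("text")))
annotations = light_pipeline.fullAnnotate('@geeky_zekey Thanks for showing again that blacks are the biggest racists. Blocked')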
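The Python snippet above stops at the `fullAnnotate` call. As a minimal sketch of how the predicted label could be read back, the helper below assumes that `fullAnnotate` returns a list with one dict per input text, keyed by output column, whose values are lists of annotation objects exposing a `.result` attribute. The names `extract_labels` and `StandInAnnotation` are hypothetical and not part of Spark NLP; the stand-in exists only so the sketch runs without a Spark session.

```python
from collections import namedtuple

# Hypothetical stand-in for a Spark NLP annotation object, so this sketch runs
# without Spark. Only the `.result` attribute is relied upon.
StandInAnnotation = namedtuple("StandInAnnotation", ["result"])


def extract_labels(annotations, output_col="class"):
    """Return the first predicted label for each annotated text, or None."""
    labels = []
    for row in annotations:
        predictions = row.get(output_col, [])
        labels.append(predictions[0].result if predictions else None)
    return labels


# Shape assumed for the output of light_pipeline.fullAnnotate(...)
sample = [{"class": [StandInAnnotation("racism")]}]
print(extract_labels(sample))
```

With a real pipeline, the same helper would be called as `extract_labels(annotations)`, where `"class"` matches the name passed to `setOutputCol` on the classifier.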
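import com.johnsnowlabs.nlp.base._
import com.johnsnowlabs.nlp.annotator._
import org.apache.spark.ml.Pipeline
import spark.implicits._ // needed for Seq(...).toDF
// Wrap the raw text column into Spark NLP documents
val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")
// Sentence embeddings from the Universal Sentence Encoder (model dependency: tfhub_use)
val use = UniversalSentenceEncoder.pretrained("tfhub_use", "en")
  .setInputCols(Array("document"))
  .setOutputCol("sentence_embeddings")
// Pretrained classifier that labels each document as neutral, racism, or sexism
val document_classifier = ClassifierDLModel.pretrained("classifierdl_use_cyberbullying", "en")
  .setInputCols(Array("document", "sentence_embeddings"))
  .setOutputCol("class")
val pipeline = new Pipeline().setStages(Array(documentAssembler, use, document_classifier))
val data = Seq("@geeky_zekey Thanks for showing again that blacks are the biggest racists. Blocked").toDF("text")
val result = pipeline.fit(data).transform(data)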
import nlu
text = ["""@geeky_zekey Thanks for showing again that blacks are the biggest racists. Blocked"""]
cyberbull_df = nlu.load('classify.cyberbullying.use').predict(text, output_level='document')
cyberbull_df[["document", "cyberbullying"]]
import nlu
nlu.load("en.classify.cyberbullying").predict("""@geeky_zekey Thanks for showing again that blacks are the biggest racists. Blocked""")
Results
+--------------------------------------------------------------------------------------------------------+------------+
|document |class |
+--------------------------------------------------------------------------------------------------------+------------+
|@geeky_zekey Thanks for showing again that blacks are the biggest racists. Blocked. | racism |
+--------------------------------------------------------------------------------------------------------+------------+
Model Information
| Model Name:    | classifierdl_use_cyberbullying |
| Compatibility: | Spark NLP 2.7.1+               |
| License:       | Open Source                    |
| Edition:       | Official                       |
| Input Labels:  | [sentence_embeddings]          |
| Output Labels: | [class]                        |
| Language:      | en                             |
| Dependencies:  | tfhub_use                      |
Data Source
This model was trained on a cyberbullying detection dataset: https://raw.githubusercontent.com/dhavalpotdar/cyberbullying-detection/master/data/data/data.csv
Benchmarking
              precision    recall  f1-score   support
     neutral       0.72      0.76      0.74       700
      racism       0.89      0.94      0.92       773
      sexism       0.82      0.71      0.76       622
    accuracy                           0.81      2095
   macro avg       0.81      0.80      0.80      2095
weighted avg       0.81      0.81      0.81      2095
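To make the summary rows easier to read, here is a minimal sketch of how the macro and weighted averages relate to the per-class rows. The per-class precision, recall, and support values are copied from the table above; the variable and function names are illustrative. Because the table values are rounded to two decimals, the sketch sticks to precision and recall, where the rounded inputs still reproduce the reported averages. An F1 recomputed from rounded inputs can differ in the last digit.

```python
# Per-class (precision, recall, support), copied from the benchmarking table
per_class = {
    "neutral": (0.72, 0.76, 700),
    "racism": (0.89, 0.94, 773),
    "sexism": (0.82, 0.71, 622),
}

total_support = sum(support for _, _, support in per_class.values())


def macro_avg(index):
    """Unweighted mean of a metric across classes."""
    return sum(row[index] for row in per_class.values()) / len(per_class)


def weighted_avg(index):
    """Mean of a metric across classes, weighted by class support."""
    return sum(row[index] * row[2] for row in per_class.values()) / total_support


macro_precision, macro_recall = macro_avg(0), macro_avg(1)
weighted_precision, weighted_recall = weighted_avg(0), weighted_avg(1)

print(f"support total:      {total_support}")
print(f"macro precision:    {macro_precision:.2f}")
print(f"macro recall:       {macro_recall:.2f}")
print(f"weighted precision: {weighted_precision:.2f}")
print(f"weighted recall:    {weighted_recall:.2f}")
```

In a single-label multiclass setting, support-weighted recall equals overall accuracy, which is why the accuracy row and the weighted-average recall both read 0.81.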