Description
Identify whether a tweet contains racism or sexism, or is neutral.
Predicted Entities
neutral, racism, sexism
How to use
from pyspark.ml import Pipeline
from sparknlp.base import DocumentAssembler, LightPipeline
from sparknlp.annotator import UniversalSentenceEncoder, ClassifierDLModel

documentAssembler = DocumentAssembler() \
    .setInputCol("text") \
    .setOutputCol("document")

use = UniversalSentenceEncoder.pretrained(lang="en") \
    .setInputCols(["document"]) \
    .setOutputCol("sentence_embeddings")

document_classifier = ClassifierDLModel.pretrained('classifierdl_use_cyberbullying', 'en') \
    .setInputCols(["document", "sentence_embeddings"]) \
    .setOutputCol("class")

nlpPipeline = Pipeline(stages=[documentAssembler, use, document_classifier])

# Fit on an empty DataFrame to build a LightPipeline for fast, in-memory annotation
light_pipeline = LightPipeline(nlpPipeline.fit(spark.createDataFrame([['']]).toDF("text")))

annotations = light_pipeline.fullAnnotate('@geeky_zekey Thanks for showing again that blacks are the biggest racists. Blocked')
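fullAnnotate returns one dictionary per input text, keyed by the pipeline's output columns; a minimal sketch of reading the predicted label from the result above (the "class" key matches the classifier's output column set in the snippet):

# Each element of `annotations` is a dict mapping output column -> list of Annotation objects
for ann in annotations:
    print(ann["class"][0].result)   # e.g. "racism"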
import com.johnsnowlabs.nlp.DocumentAssembler
import com.johnsnowlabs.nlp.embeddings.UniversalSentenceEncoder
import com.johnsnowlabs.nlp.annotators.classifier.dl.ClassifierDLModel
import org.apache.spark.ml.Pipeline
import spark.implicits._

val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val use = UniversalSentenceEncoder.pretrained(lang = "en")
  .setInputCols(Array("document"))
  .setOutputCol("sentence_embeddings")

val documentClassifier = ClassifierDLModel.pretrained("classifierdl_use_cyberbullying", "en")
  .setInputCols(Array("document", "sentence_embeddings"))
  .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(documentAssembler, use, documentClassifier))

val data = Seq("@geeky_zekey Thanks for showing again that blacks are the biggest racists. Blocked").toDF("text")
val result = pipeline.fit(data).transform(data)
import nlu
text = ["""@geeky_zekey Thanks for showing again that blacks are the biggest racists. Blocked"""]
cyberbull_df = nlu.load('classify.cyberbullying.use').predict(text, output_level='document')
cyberbull_df[["document", "cyberbullying"]]
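Column names in the DataFrame returned by NLU's predict can vary between NLU versions, so if the selection above does not match, list the available columns first (a small sketch reusing cyberbull_df from the snippet above):

# Inspect the columns NLU actually produced before selecting them
print(list(cyberbull_df.columns))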
Results
+--------------------------------------------------------------------------------------------------------+------------+
|document |class |
+--------------------------------------------------------------------------------------------------------+------------+
|@geeky_zekey Thanks for showing again that blacks are the biggest racists. Blocked. | racism |
+--------------------------------------------------------------------------------------------------------+------------+
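For reference, a comparable view can be produced with the full Spark pipeline from the Python example above by selecting the classifier's output column; this is a sketch that assumes the nlpPipeline object and an active spark session from that snippet:

df = spark.createDataFrame(
    [["@geeky_zekey Thanks for showing again that blacks are the biggest racists. Blocked"]]
).toDF("text")
result = nlpPipeline.fit(df).transform(df)
# "class.result" pulls the predicted label out of the annotation struct
result.select("text", "class.result").show(truncate=False)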
Model Information
Model Name | classifierdl_use_cyberbullying
Model Class | ClassifierDLModel
Spark Compatibility | 2.5.3
Spark NLP Compatibility | 2.4
License | open source
Edition | public
Input Labels | [document, sentence_embeddings]
Output Labels | [class]
Language | en
Upstream Dependencies | tfhub_use
Data Source
This model was trained on a cyberbullying detection dataset: https://raw.githubusercontent.com/dhavalpotdar/cyberbullying-detection/master/data/data/data.csv
Benchmarking
+--------------+-----------+--------+----------+---------+
| label        | precision | recall | f1-score | support |
+--------------+-----------+--------+----------+---------+
| none         | 0.69      | 1.00   | 0.81     | 3245    |
| racism       | 0.00      | 0.00   | 0.00     | 568     |
| sexism       | 0.00      | 0.00   | 0.00     | 922     |
| accuracy     |           |        | 0.69     | 4735    |
| macro avg    | 0.23      | 0.33   | 0.27     | 4735    |
| weighted avg | 0.47      | 0.69   | 0.56     | 4735    |
+--------------+-----------+--------+----------+---------+