English BertForSequenceClassification Cased model (from bergum)

Description

Pretrained BertForSequenceClassification model, adapted from Hugging Face and packaged for scalable, production use with Spark NLP. xtremedistil-l6-h384-go-emotion is an English model originally trained by bergum.

Predicted Entities

confusion 😕, nervousness 😬, gratitude 🙏, optimism 🤞, fear 😨, remorse 😞, excitement 🤩, relief 😅, disgust 🤮, sadness 😞, approval 👍, admiration 👏, amusement 😂, love ❤️, disapproval 👎, pride 😌, joy 😃, annoyance 😒, grief 😢, anger 😡, surprise 😲, embarrassment 😳, curiosity 🤔, realization 💡, caring 🤗, desire 😍, neutral 😐, disappointment 😞


How to use

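# Python
import sparknlp
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import Tokenizer, BertForSequenceClassification
from pyspark.ml import Pipeline

# Start (or reuse) a Spark session with Spark NLP loaded
spark = sparknlp.start()

# Turn the raw "text" column into Spark NLP document annotations
documentAssembler = DocumentAssembler() \
    .setInputCol("text") \
    .setOutputCol("document")

# Split each document into tokens
tokenizer = Tokenizer() \
    .setInputCols(["document"]) \
    .setOutputCol("token")

# Download the pretrained classifier; predictions are written to the "class" column
sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_xtremedistil_l6_h384_go_emotion", "en") \
    .setInputCols(["document", "token"]) \
    .setOutputCol("class")

pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier_loaded])

data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")

result = pipeline.fit(data).transform(data)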
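
// Scala
import com.johnsnowlabs.nlp.base._
import com.johnsnowlabs.nlp.annotator._
import org.apache.spark.ml.Pipeline
import spark.implicits._ // enables .toDF on a Seq; assumes a SparkSession named spark

// Turn the raw "text" column into Spark NLP document annotations
val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

// Split each document into tokens
val tokenizer = new Tokenizer()
    .setInputCols(Array("document"))
    .setOutputCol("token")

// Download the pretrained classifier; predictions are written to the "class" column
val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_xtremedistil_l6_h384_go_emotion", "en")
    .setInputCols(Array("document", "token"))
    .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier_loaded))

val data = Seq("PUT YOUR STRING HERE").toDF("text")

val result = pipeline.fit(data).transform(data)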
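
The snippets above stop at the transformed DataFrame. As a minimal Python sketch of one way to read the predictions back, the helper below pairs each input text with the label strings in its "class" annotations. The name predicted_labels is hypothetical and not part of Spark NLP; the sketch assumes each annotation exposes its label under the "result" key, as collected Spark NLP annotation rows normally do. The sample row is a hand-written stand-in, not real model output.

```python
def predicted_labels(rows, text_col="text", class_col="class"):
    """Pair each input text with the label strings found in its annotation column.

    Hypothetical helper: works on anything indexable by column name, such as
    collected PySpark Row objects or plain dicts.
    """
    pairs = []
    for row in rows:
        # Each annotation is assumed to carry its predicted label under "result".
        labels = [annotation["result"] for annotation in row[class_col]]
        pairs.append((row[text_col], labels))
    return pairs


# With the Python pipeline output, one might call (needs a running Spark session):
#   predicted_labels(result.select("text", "class").collect())

# Stand-in row shaped like collected output, so the helper can be tried without Spark.
# The text/label pairing is made up for illustration only.
sample_rows = [
    {"text": "Thanks a lot!", "class": [{"result": "gratitude 🙏"}]},
]
print(predicted_labels(sample_rows))
```

Collecting to the driver is only sensible for small result sets; for larger data, selecting "class.result" and keeping the work inside Spark is likely the better choice.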

Model Information

Model Name: bert_classifier_xtremedistil_l6_h384_go_emotion
Compatibility: Spark NLP 4.2.0+
License: Open Source
Edition: Official
Input Labels: [document, token]
Output Labels: [class]
Language: en
Size: 84.5 MB
Case sensitive: true
Max sentence length: 256

References

  • https://huggingface.co/bergum/xtremedistil-l6-h384-go-emotion
  • https://colab.research.google.com/github/jobergum/emotion/blob/main/TrainGoEmotions.ipynb
  • https://aiserv.cloud/
  • https://github.com/jobergum/browser-ml-inference
  • https://paperswithcode.com/sota?task=Multi+Label+Text+Classification&dataset=go_emotions