Description
Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. autonlp-txc-17923124 is an English model originally trained by emekaboris.
Predicted Entities
1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0, 18.0, 19.0, 20.0, 21.0, 22.0, 23.0, 24.0
How to use
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import Tokenizer, RoBertaForSequenceClassification
from pyspark.ml import Pipeline

documentAssembler = DocumentAssembler() \
    .setInputCol("text") \
    .setOutputCol("document")

tokenizer = Tokenizer() \
    .setInputCols("document") \
    .setOutputCol("token")

roberta_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autonlp_txc_17923124", "en") \
    .setInputCols(["document", "token"]) \
    .setOutputCol("class")

pipeline = Pipeline(stages=[documentAssembler, tokenizer, roberta_classifier])

data = spark.createDataFrame([["I love you!"], ["I feel lucky to be here."]]).toDF("text")

result = pipeline.fit(data).transform(data)
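The fitted pipeline writes its predictions to the class column as Spark NLP annotations; a minimal sketch (assuming the column names configured above) for reading the predicted label per row:

# The predicted label for each input row sits in the annotation's result field
result.select("text", "class.result").show(truncate=False)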
import com.johnsnowlabs.nlp.base._
import com.johnsnowlabs.nlp.annotator._
import org.apache.spark.ml.Pipeline
import spark.implicits._

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val tokenizer = new Tokenizer()
    .setInputCols("document")
    .setOutputCol("token")

val roberta_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autonlp_txc_17923124", "en")
    .setInputCols(Array("document", "token"))
    .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, roberta_classifier))

val data = Seq("I love you!").toDS.toDF("text")

val result = pipeline.fit(data).transform(data)
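As in the Python example, the predicted label can be read back from the class column; a minimal sketch, assuming the same column names:

// Show the predicted label stored in the annotation's result field
result.select("class.result").show(false)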
import nlu
nlu.load("en.classify.roberta.by_emekaboris").predict("""I feel lucky to be here.""")
Model Information
| Model Name: | roberta_classifier_autonlp_txc_17923124 |
| Compatibility: | Spark NLP 4.2.4+ |
| License: | Open Source |
| Edition: | Official |
| Input Labels: | [document, token] |
| Output Labels: | [class] |
| Language: | en |
| Size: | 427.0 MB |
| Case sensitive: | true |
| Max sentence length: | 128 |
References
- https://huggingface.co/emekaboris/autonlp-txc-17923124