Description
Predicts the affiliation, if any, of the information in a paragraph.
Predicted Entities
How to use
import sparknlp
from pyspark.ml import Pipeline
from sparknlp.annotator import *
from sparknlp.base import *

# Start (or reuse) a Spark session with Spark NLP available
spark = sparknlp.start()

# Turn raw text into document annotations
document_assembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

# Split each document into tokens
tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

# MODEL_NAME is a placeholder for the path to the saved model
sequence_classifier = RoBertaForSequenceClassification.load(MODEL_NAME) \
    .setInputCols(['document', 'token']) \
    .setOutputCol('class')

pipeline = Pipeline(stages=[
    document_assembler,
    tokenizer,
    sequence_classifier
])

# A couple of simple examples
example = spark.createDataFrame([['I love you!'], ['I feel lucky to be here.']]).toDF('text')
result = pipeline.fit(example).transform(example)

# result is a DataFrame; class.result holds the predicted label(s) for each row
result.select('text', 'class.result').show()
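The selected "class.result" column is an array of label strings, so downstream code usually has to unwrap it. The following is a minimal sketch of that step, assuming the rows come from result.select("text", "class.result").collect() and therefore expose the fields "text" and "result". The helper name rows_to_predictions and the label strings in the sample data are hypothetical placeholders, not part of Spark NLP or of this model's label set.

```python
from typing import Any, Iterable, List, Optional, Tuple


def rows_to_predictions(rows: Iterable[Any]) -> List[Tuple[str, Optional[str]]]:
    """Pair each input text with its first predicted label, or None if absent."""
    predictions = []
    for row in rows:
        # "result" is an array of label strings; it may be empty or null
        labels = row["result"] or []
        predictions.append((row["text"], labels[0] if labels else None))
    return predictions


# Stand-in rows shaped like result.select("text", "class.result").collect().
# The label strings are placeholders (assumption), not real model output.
sample_rows = [
    {"text": "first paragraph", "result": ["LABEL_A"]},
    {"text": "second paragraph", "result": []},
]

for text, label in rows_to_predictions(sample_rows):
    print(f"{text!r} -> {label}")
```

Because the helper only indexes rows by field name, it should work the same way on plain dictionaries and on the Row objects returned by collect().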
Model Information
| Model Name: | Affiliation_Classifier_Roberta |
| Compatibility: | Spark NLP 5.2.0+ |
| License: | Open Source |
| Edition: | Community |
| Input Labels: | [document, token] |
| Output Labels: | [class] |
| Language: | en |
| Size: | 441.4 MB |
| Case sensitive: | true |
| Max sentence length: | 128 |
| Dependencies: | None |