Description
Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated for scalability and production-readiness with Spark NLP. Robertabase_Ana4 is an English model originally trained by vinaydngowda.
Predicted Entities
- Credit reporting, credit repair services, or other personal consumer reports
- Payday loan, title loan, or personal loan
- Credit card or prepaid card
- Money transfer, virtual currency, or money service
- Mortgage
- Student loan
- Checking or savings account
- Debt collection
- Vehicle loan or lease
How to use
import sparknlp
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import Tokenizer, BertForSequenceClassification
from pyspark.ml import Pipeline
spark = sparknlp.start()
documentAssembler = DocumentAssembler() \
    .setInputCol("text") \
    .setOutputCol("document")
tokenizer = Tokenizer() \
    .setInputCols(["document"]) \
    .setOutputCol("token")
sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_robertabase_ana4", "en") \
    .setInputCols(["document", "token"]) \
    .setOutputCol("class")
pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier])
data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
result = pipeline.fit(data).transform(data)
import com.johnsnowlabs.nlp.base._
import com.johnsnowlabs.nlp.annotator._
import org.apache.spark.ml.Pipeline
import spark.implicits._
val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")
val tokenizer = new Tokenizer()
  .setInputCols("document")
  .setOutputCol("token")
val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_robertabase_ana4", "en")
  .setInputCols(Array("document", "token"))
  .setOutputCol("class")
val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
val result = pipeline.fit(data).transform(data)
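The pipelines above stop at `transform`, so the predicted category is still wrapped inside annotation structs. The following is a minimal Python sketch of one way to pull the label text out of the result. It assumes the output column is named `class`, as in the Python example, and the helper name `select_predictions` is hypothetical rather than part of Spark NLP.

```python
def select_predictions(result, text_col="text", output_col="class"):
    """Return the input text next to the predicted label(s).

    `result` is the DataFrame produced by pipeline.fit(data).transform(data).
    Each row of `output_col` holds an array of annotation structs; selecting
    the nested `result` field keeps only the label strings.
    """
    # Hypothetical helper: column names are assumptions taken from the example.
    return result.select(text_col, f"{output_col}.result")
```

With the `result` DataFrame from the Python example, `select_predictions(result).show(truncate=False)` should print each input string beside its predicted category.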
Model Information
| Model Name: | bert_sequence_classifier_robertabase_ana4 |
| Compatibility: | Spark NLP 4.3.1+ |
| License: | Open Source |
| Edition: | Official |
| Input Labels: | [document, token] |
| Output Labels: | [class] |
| Language: | en |
| Size: | 1.3 GB |
| Case sensitive: | true |
| Max sentence length: | 128 |
References
- https://huggingface.co/vinaydngowda/Robertabase_Ana4