Description
Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated for scalability and production-readiness with Spark NLP. distilbert_base_uncased_finetuned_legal_data is an English model originally trained by MariamD.
How to use
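# Python
from sparknlp.base import MultiDocumentAssembler
from sparknlp.annotator import DistilBertForQuestionAnswering
from pyspark.ml import Pipeline

# `data` is assumed to be a Spark DataFrame with string columns "question" and "context"
document_assembler = MultiDocumentAssembler() \
    .setInputCols(["question", "context"]) \
    .setOutputCols(["document_question", "document_context"])

# Load the pretrained span classifier; it reads both document columns and writes "answer"
spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_legal_data", "en") \
    .setInputCols(["document_question", "document_context"]) \
    .setOutputCol("answer")

pipeline = Pipeline().setStages([document_assembler, spanClassifier])
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)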
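// Scala
import com.johnsnowlabs.nlp.MultiDocumentAssembler
import com.johnsnowlabs.nlp.annotators.classifier.dl.DistilBertForQuestionAnswering
import org.apache.spark.ml.Pipeline

// `data` is assumed to be a DataFrame with string columns "question" and "context"
val document_assembler = new MultiDocumentAssembler()
  .setInputCols(Array("question", "context"))
  .setOutputCols(Array("document_question", "document_context"))

val spanClassifier = DistilBertForQuestionAnswering
  .pretrained("distilbert_base_uncased_finetuned_legal_data", "en")
  .setInputCols(Array("document_question", "document_context"))
  .setOutputCol("answer")

val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)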
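The answer column of pipelineDF holds Spark NLP annotation structs, and the predicted span is stored in each annotation's result field. A minimal Python sketch of pulling that text out of collected rows might look like the following. The names extract_answers and sample_rows are hypothetical, introduced here for illustration, and the sample values are made up so the snippet runs without Spark or the model download.

```python
from typing import Any, Iterable, List


def extract_answers(rows: Iterable[Any], column: str = "answer") -> List[str]:
    """Return the `result` text of every annotation found in `column`.

    Each row only needs to support lookup by field name, so plain dicts
    and collected PySpark Row objects are both acceptable inputs.
    """
    answers: List[str] = []
    for row in rows:
        for annotation in row[column]:
            answers.append(annotation["result"])
    return answers


# Hypothetical stand-in for collected pipeline output; the values are invented.
sample_rows = [
    {
        "answer": [
            {
                "annotatorType": "chunk",
                "begin": 0,
                "end": 0,
                "result": "thirty days",
                "metadata": {},
            }
        ]
    },
]

print(extract_answers(sample_rows))
```

With the Python pipeline above, the same helper could presumably be called as extract_answers(pipelineDF.select("answer").collect()), since collected PySpark Row objects support lookup by field name.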
Model Information
| Model Name: | distilbert_base_uncased_finetuned_legal_data |
| Compatibility: | Spark NLP 5.2.0+ |
| License: | Open Source |
| Edition: | Official |
| Input Labels: | [document_question, document_context] |
| Output Labels: | [answer] |
| Language: | en |
| Size: | 247.2 MB |
References
https://huggingface.co/MariamD/distilbert-base-uncased-finetuned-legal_data