Description
Pretrained XlmRoBertaSentenceEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. sent_xlm_roberta_biolord_2023_m is a multilingual model originally trained by FremyCompany. It supports English, Spanish, French, German, Dutch, Danish and Swedish.
How to use
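# Python
import sparknlp
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import XlmRoBertaSentenceEmbeddings
from pyspark.ml import Pipeline
spark = sparknlp.start()  # Spark session with Spark NLP on the classpath

# Wrap the raw "text" column in document annotations
documentAssembler = DocumentAssembler() \
    .setInputCol("text") \
    .setOutputCol("document")
# Load the pretrained multilingual ("xx") sentence-embedding model
embeddings = XlmRoBertaSentenceEmbeddings.pretrained("sent_xlm_roberta_biolord_2023_m", "xx") \
    .setInputCols(["document"]) \
    .setOutputCol("embeddings")

pipeline = Pipeline().setStages([documentAssembler, embeddings])
data = spark.createDataFrame([["Disfruto trabajando con Spark-NLP."]]).toDF("text")
pipelineModel = pipeline.fit(data)
result = pipelineModel.transform(data)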
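// Scala
import com.johnsnowlabs.nlp.DocumentAssembler
import com.johnsnowlabs.nlp.embeddings.XlmRoBertaSentenceEmbeddings
import org.apache.spark.ml.Pipeline
import spark.implicits._ // needed for toDF; assumes a SparkSession named `spark` is in scope

// Wrap the raw "text" column in document annotations
val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")
// Load the pretrained multilingual ("xx") sentence-embedding model
val embeddings = XlmRoBertaSentenceEmbeddings
  .pretrained("sent_xlm_roberta_biolord_2023_m", "xx")
  .setInputCols(Array("document"))
  .setOutputCol("embeddings")

val pipeline = new Pipeline().setStages(Array(documentAssembler, embeddings))
val data = Seq("Disfruto trabajando con Spark-NLP.").toDF("text")
val pipelineModel = pipeline.fit(data)
val result = pipelineModel.transform(data)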
Results
+----------------------------------+----------------------------------------------------------------------+----------------------------------------------------------------------+
| text| document| sentence_embeddings|
+----------------------------------+----------------------------------------------------------------------+----------------------------------------------------------------------+
|Disfruto trabajando con Spark-NLP.|[{document, 0, 33, Disfruto trabajando con Spark-NLP., {sentence ->...|[{sentence_embeddings, 0, 33, Disfruto trabajando con Spark-NLP., {...|
+----------------------------------+----------------------------------------------------------------------+----------------------------------------------------------------------+
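Each entry in the embeddings column is an annotation that carries, alongside the fields shown above, a numeric vector. A common next step is comparing two sentences by the cosine similarity of their vectors. The sketch below is a minimal illustration in plain Python, not part of the model card: the helper names extract_vectors and cosine_similarity are hypothetical, the annotations are mimicked with dicts, and the three-dimensional vectors are toy values rather than real model output (real vectors are far longer). It assumes each annotation exposes its vector under an "embeddings" key, as rows collected from the pipeline output typically do.

```python
import math


def extract_vectors(annotations):
    """Collect the embedding vector from each annotation (hypothetical helper)."""
    return [list(a["embeddings"]) for a in annotations]


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy stand-ins for collected annotations; the vectors are made up for illustration.
row_a = [{
    "annotatorType": "sentence_embeddings",
    "begin": 0,
    "end": 33,
    "result": "Disfruto trabajando con Spark-NLP.",
    "embeddings": [1.0, 0.0, 1.0],
}]
row_b = [{
    "annotatorType": "sentence_embeddings",
    "begin": 0,
    "end": 30,
    "result": "I enjoy working with Spark NLP.",
    "embeddings": [1.0, 1.0, 0.0],
}]

vec_a = extract_vectors(row_a)[0]
vec_b = extract_vectors(row_b)[0]
print(f"{cosine_similarity(vec_a, vec_b):.3f}")  # prints 0.500 for these toy vectors
```

With real output, the same two helpers could be applied to rows obtained from result.select("embeddings").collect(), assuming the output column keeps the name set in the pipeline.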
Model Information
| Model Name: | sent_xlm_roberta_biolord_2023_m |
| Compatibility: | Spark NLP 5.5.2+ |
| License: | Open Source |
| Edition: | Official |
| Input Labels: | [document] |
| Output Labels: | [xlm_sentence_embeddings] |
| Language: | xx |
| Size: | 1.0 GB |
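Because the model lists Spark NLP 5.5.2+ as its compatibility floor, it can be useful to check the installed version before downloading a 1.0 GB model. The following is a rough sketch in plain Python; parse_version and meets_minimum are hypothetical helpers, and the version strings are hard-coded examples. In practice the installed version string would come from the library itself (sparknlp.version() is assumed here to be the source).

```python
def parse_version(version):
    """Turn '5.5.2' into (5, 5, 2), ignoring non-numeric suffixes such as 'rc1'."""
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits) if digits else 0)
    # Pad short versions such as '6.0' to three components
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def meets_minimum(installed, minimum="5.5.2"):
    """True if the installed version is at least the minimum (tuple comparison)."""
    return parse_version(installed) >= parse_version(minimum)


print(meets_minimum("5.5.2"))  # True
print(meets_minimum("5.4.1"))  # False
print(meets_minimum("6.0.0"))  # True
```

Comparing parsed tuples rather than raw strings avoids the usual pitfall where "5.10.0" sorts before "5.5.2" lexicographically.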
References
https://huggingface.co/FremyCompany/BioLORD-2023-M