Description
Word Embeddings lookup annotator that maps tokens to vectors. These Spanish legal-domain embeddings were trained with the Skip-gram model, in which the distributed representation of the input word is used to predict its context.
How to use
from sparknlp.annotator import WordEmbeddingsModel

model = WordEmbeddingsModel.pretrained("word2vec_skipgram_legal_d50_uncased", "es") \
    .setInputCols(["document", "token"]) \
    .setOutputCol("word_embeddings")
import com.johnsnowlabs.nlp.embeddings.WordEmbeddingsModel

val model = WordEmbeddingsModel.pretrained("word2vec_skipgram_legal_d50_uncased", "es")
  .setInputCols("document", "token")
  .setOutputCol("word_embeddings")
import nlu
nlu.load("es.embed.legal.skipgram.uncased_d50").predict("""Put your text here.""")
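For context, a minimal end-to-end sketch of how these embeddings are typically used inside a Spark NLP pipeline is shown below. The surrounding stages, column names, and the sample sentence are illustrative assumptions, not part of the original model card.

import sparknlp
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import Tokenizer, WordEmbeddingsModel
from pyspark.ml import Pipeline

spark = sparknlp.start()

# Assemble raw text into documents, tokenize, then look up a vector per token.
document_assembler = DocumentAssembler() \
    .setInputCol("text") \
    .setOutputCol("document")

tokenizer = Tokenizer() \
    .setInputCols(["document"]) \
    .setOutputCol("token")

embeddings = WordEmbeddingsModel.pretrained("word2vec_skipgram_legal_d50_uncased", "es") \
    .setInputCols(["document", "token"]) \
    .setOutputCol("word_embeddings")

pipeline = Pipeline(stages=[document_assembler, tokenizer, embeddings])

# Illustrative Spanish legal sentence (an assumption for demonstration only).
data = spark.createDataFrame([["El contrato fue firmado por ambas partes."]]).toDF("text")
result = pipeline.fit(data).transform(data)

# Each annotation in "word_embeddings" carries one vector per token.
result.selectExpr("explode(word_embeddings.embeddings) AS vector").show(truncate=80)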
Model Information
Model Name: word2vec_skipgram_legal_d50_uncased
Type: embeddings
Compatibility: Spark NLP 4.2.1+
License: Open Source
Edition: Official
Input Labels: [document, token]
Output Labels: [embeddings]
Language: es
Size: 172.2 MB
Case sensitive: false
Dimension: 100
References
https://zenodo.org/record/5036147