Description
This French Word2Vec model was trained by Jean-Philippe Fauconnier on the frWaC corpus, with a window size of 100 and 200-dimensional vectors.
How to use
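# Python
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import Tokenizer, WordEmbeddingsModel
# Wrap the raw "text" column as Spark NLP documents.
documentAssembler = DocumentAssembler()\
    .setInputCol("text")\
    .setOutputCol("document")
# Split each document into tokens.
tokenizer = Tokenizer()\
    .setInputCols(["document"])\
    .setOutputCol("token")
# Load the pretrained French embeddings; each token gets a 200-dimensional vector.
embeddings = WordEmbeddingsModel.pretrained("word2vec_wac_200", "fr")\
    .setInputCols(["document", "token"])\
    .setOutputCol("embeddings")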
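// Scala
import com.johnsnowlabs.nlp.base._
import com.johnsnowlabs.nlp.annotator._
// Wrap the raw "text" column as Spark NLP documents.
val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")
// Split each document into tokens.
val tokenizer = new Tokenizer()
  .setInputCols("document")
  .setOutputCol("token")
// Load the pretrained French embeddings; each token gets a 200-dimensional vector.
val embeddings = WordEmbeddingsModel.pretrained("word2vec_wac_200", "fr")
  .setInputCols("document", "token")
  .setOutputCol("embeddings")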
# Python, using the NLU library
import nlu
nlu.load("fr.embed.word2vec_wac_200").predict("""Put your text here.""")
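Once the stages above have been run, each token in the `embeddings` output carries a vector. The following is a minimal sketch, in plain Python with no Spark dependency, of how such vectors might be compared. It assumes each annotation has already been collected into a dictionary with a `result` key (the token text) and an `embeddings` key (a list of floats). The helper names `token_vectors` and `cosine_similarity` are hypothetical, and the 3-dimensional toy values are made up, standing in for the model's 200-dimensional vectors.

```python
import math
from typing import Dict, List, Sequence


def token_vectors(annotations: Sequence[dict]) -> Dict[str, List[float]]:
    """Map each token's text to its vector (the last occurrence wins)."""
    return {a["result"]: list(a["embeddings"]) for a in annotations}


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between u and v; 0.0 if either vector is all zeros."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy annotations (assumed shape, made-up values, 3 dimensions instead of 200).
annotations = [
    {"result": "roi", "embeddings": [1.0, 2.0, 0.0]},
    {"result": "reine", "embeddings": [2.0, 4.0, 0.0]},
    {"result": "pomme", "embeddings": [0.0, 0.0, 3.0]},
]

vectors = token_vectors(annotations)
print(cosine_similarity(vectors["roi"], vectors["reine"]))  # parallel vectors, close to 1
print(cosine_similarity(vectors["roi"], vectors["pomme"]))  # orthogonal vectors
```

With real output, the same two helpers could be applied to the collected annotations of any sentence; only the vector length changes.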
Model Information
| Model Name: | word2vec_wac_200 |
| Type: | embeddings |
| Compatibility: | Spark NLP 3.4.0+ |
| License: | Open Source |
| Edition: | Official |
| Input Labels: | [document, token] |
| Output Labels: | [embeddings] |
| Language: | fr |
| Size: | 118.0 MB |
| Case sensitive: | false |
| Dimension: | 200 |
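The `Case sensitive: false` and `Dimension: 200` entries can be illustrated with a toy lookup. This is a rough sketch under stated assumptions, not Spark NLP's implementation: the `lookup` helper and the two-word vocabulary are hypothetical, the vector values are made up, and returning a zero vector for unknown tokens is an illustrative choice.

```python
from typing import Dict, List

DIMENSION = 200  # matches the "Dimension" entry in the table


def lookup(vocab: Dict[str, List[float]], token: str) -> List[float]:
    """Case-insensitive lookup: fold the token to lowercase before matching."""
    vector = vocab.get(token.lower())
    if vector is None:
        # Illustrative fallback for out-of-vocabulary tokens (an assumption).
        return [0.0] * DIMENSION
    return vector


# Hypothetical vocabulary with lowercase keys and made-up vectors.
vocab = {"paris": [0.1] * DIMENSION, "france": [0.2] * DIMENSION}

print(lookup(vocab, "Paris") == lookup(vocab, "paris"))  # same vector regardless of case
print(len(lookup(vocab, "inconnu")))                     # every result has DIMENSION entries
```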
References
This model was trained by Jean-Philippe Fauconnier on the frWaC Corpus. [1]
[1] Fauconnier, Jean-Philippe (2015), French Word Embeddings, http://fauconnier.github.io