E5 Base Sentence Embeddings

Description

E5 base sentence embeddings, from the paper "Text Embeddings by Weakly-Supervised Contrastive Pre-training" by Liang Wang, Nan Yang, Xiaolong Huang, Binxing Jiao, Linjun Yang, Daxin Jiang, Rangan Majumder, and Furu Wei (arXiv, 2022).

How to use

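from sparknlp.base import DocumentAssembler
from sparknlp.annotator import E5Embeddings
from pyspark.ml import Pipeline

document_assembler = DocumentAssembler().setInputCol("text").setOutputCol("documents")  # "text" column name is assumed
embeddings = E5Embeddings.pretrained("e5_base", "en") \
    .setInputCols(["documents"]) \
    .setOutputCol("e5")

pipeline = Pipeline().setStages([document_assembler, embeddings])

import com.johnsnowlabs.nlp.DocumentAssembler
import com.johnsnowlabs.nlp.embeddings.E5Embeddings
import org.apache.spark.ml.Pipeline
val document = new DocumentAssembler().setInputCol("text").setOutputCol("documents") // "text" column name is assumed
val embeddings = E5Embeddings.pretrained("e5_base", "en")
  .setInputCols(Array("documents"))
  .setOutputCol("e5")
val pipeline = new Pipeline().setStages(Array(document, embeddings))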
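
The pipeline's output column holds one float vector per document, and a common next step is to compare two such vectors by cosine similarity. The following is a minimal sketch in plain Python, not part of Spark NLP: `add_prefix` and `cosine_similarity` are hypothetical helper names, and the toy vectors stand in for real model output. The "query: " and "passage: " prefixes follow the input convention described in the E5 paper; whether this particular packaged model benefits from them is an assumption worth checking on your own data.

```python
import math

def add_prefix(text, role="query"):
    """Prepend the E5 role prefix ("query: " or "passage: ") to raw text."""
    if role not in ("query", "passage"):
        raise ValueError("role must be 'query' or 'passage'")
    return f"{role}: {text}"

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length float vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Toy vectors standing in for real embeddings (assumption, not model output)
query_vec = [1.0, 0.0, 1.0]
passage_vec = [1.0, 0.0, 1.0]
other_vec = [0.0, 1.0, 0.0]
```

In a real run, the prefix would presumably be applied to the raw text column before it reaches the document assembler, and the vectors would be read from the embeddings field of the output annotations rather than written by hand.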

Model Information

Model Name: e5_base
Compatibility: Spark NLP 5.1.0+
License: Open Source
Edition: Official
Input Labels: [documents]
Output Labels: [e5]
Language: en
Size: 258.6 MB