sparknlp.annotator.embeddings.e5v_embeddings

Module Contents

Classes

E5VEmbeddings

Universal multimodal embeddings using the E5-V model (see https://huggingface.co/royokong/e5-v).

class E5VEmbeddings(classname='com.johnsnowlabs.nlp.embeddings.E5VEmbeddings', java_model=None)

Universal multimodal embeddings using the E5-V model (see https://huggingface.co/royokong/e5-v).

E5-V bridges the modality gap between different input types (text, image) and demonstrates strong performance in multimodal embeddings, even without fine-tuning. It also supports a single-modality training approach, where the model is trained exclusively on text pairs, often yielding better performance than multimodal training.

Pretrained models can be loaded with the static pretrained() method:

>>> e5vEmbeddings = E5VEmbeddings.pretrained() \
...     .setInputCols(["image_assembler"]) \
...     .setOutputCol("e5v")

If no name is provided, the default model is "e5v_int4".

For available pretrained models please see the Models Hub.

Input Annotation types: IMAGE

Output Annotation type: SENTENCE_EMBEDDINGS

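Image + Text Embedding:

>>> from sparknlp.base import *
>>> from sparknlp.annotator import *
>>> from pyspark.ml import Pipeline
>>> from pyspark.sql.functions import lit
>>> # imageFolder is a placeholder for the path to a directory of images
>>> image_df = spark.read.format("image").option("dropInvalid", True).load(imageFolder)
>>> imagePrompt = "<|start_header_id|>user<|end_header_id|>\n\n<image>\nSummary above image in one word: <|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n \n"
>>> test_df = image_df.withColumn("text", lit(imagePrompt))
>>> imageAssembler = ImageAssembler() \
...     .setInputCol("image") \
...     .setOutputCol("image_assembler")
>>> e5vEmbeddings = E5VEmbeddings.pretrained() \
...     .setInputCols(["image_assembler"]) \
...     .setOutputCol("e5v")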
>>> pipeline = Pipeline().setStages([
...     imageAssembler,
...     e5vEmbeddings
... ])
>>> result = pipeline.fit(test_df).transform(test_df)
>>> result.select("e5v.embeddings").show(truncate = False)
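The image prompt in this example and the text-only prompt in the next one share the same chat wrapper and differ only in the instruction placed in the user turn. A minimal sketch of building both from one template follows. `LLAMA3_TEMPLATE`, `image_prompt`, and `text_prompt` are hypothetical helper names, not part of Spark NLP, and the trailing whitespace of the template is an assumption based on the E5-V model card.

```python
# Hypothetical helpers (not part of Spark NLP) that build E5-V prompts.
# Assumption: the chat wrapper follows the Llama 3 template from the E5-V model card.
LLAMA3_TEMPLATE = (
    "<|start_header_id|>user<|end_header_id|>\n\n"
    "{}"
    "<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n \n"
)


def image_prompt() -> str:
    """Prompt for an image input; <image> marks where the image is injected."""
    return LLAMA3_TEMPLATE.format("<image>\nSummary above image in one word: ")


def text_prompt(sentence: str) -> str:
    """Prompt for a text-only input built around the given sentence."""
    return LLAMA3_TEMPLATE.format(
        f"{sentence}\nSummary above sentence in one word: "
    )
```

With these helpers, the "text" column could be filled with lit(image_prompt()) for images or lit(text_prompt("A cat sitting in a box.")) for text-only rows.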

Text-Only Embedding:

>>> from sparknlp.util import EmbeddingsDataFrameUtils
>>> textPrompt = "<|start_header_id|>user<|end_header_id|>\n\n<sent>\nSummary above sentence in one word: <|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n \n"

>>> textDesc = "A cat sitting in a box."
>>> nullImageDF = spark.createDataFrame(spark.sparkContext.parallelize([EmbeddingsDataFrameUtils.emptyImageRow]), EmbeddingsDataFrameUtils.imageSchema)
>>> textDF = nullImageDF.withColumn("text", lit(textPrompt.replace("<sent>", textDesc)))
>>> e5vEmbeddings = E5VEmbeddings.pretrained() \
...     .setInputCols(["image"]) \
...     .setOutputCol("e5v")
>>> result = e5vEmbeddings.transform(textDF)
>>> result.select("e5v.embeddings").show(truncate = False)
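
Since E5-V is described as bridging the gap between text and image inputs, a common next step is to compare the resulting vectors. The sketch below computes cosine similarity in plain Python. The function name `cosine_similarity` and the short sample vectors are illustrative assumptions; real embeddings collected from the `e5v.embeddings` column are much longer.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Made-up 4-dimensional vectors standing in for collected embeddings.
image_vec = [0.2, 0.1, 0.4, 0.3]
text_vec = [0.2, 0.1, 0.4, 0.3]
other_vec = [-0.3, 0.4, -0.1, 0.2]

print(cosine_similarity(image_vec, text_vec))   # identical direction
print(cosine_similarity(image_vec, other_vec))  # orthogonal direction
```

Assuming each row holds a single annotation, a vector could be pulled out with something like result.select("e5v.embeddings").first()[0][0] before being passed to the function.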

name = 'E5VEmbeddings'

inputAnnotatorTypes

outputAnnotatorType = 'sentence_embeddings'

static loadSavedModel(folder, spark_session, use_openvino=False)

Loads a locally saved model.

Parameters:
folder : str

Folder of the saved model

spark_session : pyspark.sql.SparkSession

The current SparkSession

use_openvino : bool, optional

Whether to use the OpenVINO engine, by default False

Returns:
E5VEmbeddings

The restored model
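
As a rough sketch of how these parameters fit together, the hypothetical helper below wraps `loadSavedModel` and sets the column names used in the earlier pipeline example. The helper name `load_local_e5v` and its default column names are assumptions, not part of Spark NLP. The import sits inside the function so that defining the helper does not itself require Spark NLP to be installed.

```python
def load_local_e5v(folder, spark_session, use_openvino=False,
                   input_col="image_assembler", output_col="e5v"):
    """Hypothetical convenience wrapper around E5VEmbeddings.loadSavedModel."""
    # Imported lazily so that defining this helper does not need Spark NLP.
    from sparknlp.annotator import E5VEmbeddings

    model = E5VEmbeddings.loadSavedModel(
        folder, spark_session, use_openvino=use_openvino
    )
    # Column names mirror the pipeline example on this page.
    return model.setInputCols([input_col]).setOutputCol(output_col)
```

Calling it requires a running SparkSession and a folder containing a model exported in a format that loadSavedModel accepts.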

static pretrained(name='e5v_int4', lang='en', remote_loc=None)

Downloads and loads a pretrained model.

Parameters:
name : str, optional

Name of the pretrained model, by default "e5v_int4"

lang : str, optional

Language of the pretrained model, by default "en"

remote_loc : str, optional

Optional remote address of the resource, by default None. Spark NLP's repositories are used otherwise.

Returns:
E5VEmbeddings

The restored model