sparknlp.annotator.cv.florence2_transformer#
Module Contents#
Classes#
Florence2Transformer | Florence2Transformer can load Florence-2 models for a variety of vision and vision-language tasks using prompt-based inference.
- class Florence2Transformer(classname='com.johnsnowlabs.nlp.annotators.cv.Florence2Transformer', java_model=None)[source]#
Florence2Transformer can load Florence-2 models for a variety of vision and vision-language tasks using prompt-based inference.
The model supports image captioning, object detection, segmentation, OCR, and more, using prompt tokens as described in the Florence-2 documentation.
Pretrained models can be loaded with pretrained() of the companion object:

>>> florence2 = Florence2Transformer.pretrained() \
...     .setInputCols(["image_assembler"]) \
...     .setOutputCol("answer")
The default model is "florence2_base_ft_int4", if no name is provided. For available pretrained models, please see the Models Hub.
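A specific model can also be requested by name and language instead of the default. A minimal sketch, assuming the default model is published under the "en" language code on the Models Hub:

>>> florence2 = Florence2Transformer.pretrained("florence2_base_ft_int4", "en") \
...     .setInputCols(["image_assembler"]) \
...     .setOutputCol("answer")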
Input Annotation types | Output Annotation type
IMAGE | DOCUMENT
- Parameters:
- batchSize
Batch size. Larger values allow faster processing but require more memory, by default 2
- maxOutputLength
Maximum length of output text, by default 200
- minOutputLength
Minimum length of the sequence to be generated, by default 10
- doSample
Whether or not to use sampling; use greedy decoding otherwise, by default False
- temperature
The value used to modulate the next token probabilities, by default 1.0
- topK
The number of highest probability vocabulary tokens to keep for top-k-filtering, by default 50
- topP
If set to float < 1, only the most probable tokens with probabilities that add up to top_p or higher are kept for generation, by default 1.0
- repetitionPenalty
The parameter for repetition penalty. 1.0 means no penalty, by default 1.0
- noRepeatNgramSize
If set to int > 0, all ngrams of that size can only occur once, by default 3
- ignoreTokenIds
A list of token ids which are ignored in the decoder’s output, by default []
- beamSize
The number of beams for beam search, by default 1
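These generation parameters are tuned through setter methods on the annotator. A minimal sketch: setTopK and setRepetitionPenalty are documented below, while the remaining setter names are assumed to follow Spark NLP's usual set<ParamName> convention for the parameters listed above:

>>> florence2 = Florence2Transformer.pretrained() \
...     .setInputCols(["image_assembler"]) \
...     .setOutputCol("answer") \
...     .setTopK(50) \
...     .setRepetitionPenalty(1.0) \
...     .setMaxOutputLength(200) \
...     .setBeamSize(1)  # setMaxOutputLength/setBeamSize assumed from the parameter names above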
Examples
>>> import sparknlp
>>> from sparknlp.base import *
>>> from sparknlp.annotator import *
>>> from pyspark.sql.functions import lit
>>> from pyspark.ml import Pipeline
>>> image_df = spark.read.format("image").load(path=images_path)
>>> test_df = image_df.withColumn("text", lit("<OD>"))
>>> imageAssembler = ImageAssembler() \
...     .setInputCol("image") \
...     .setOutputCol("image_assembler")
>>> florence2 = Florence2Transformer.pretrained() \
...     .setInputCols(["image_assembler"]) \
...     .setOutputCol("answer")
>>> pipeline = Pipeline().setStages([
...     imageAssembler,
...     florence2
... ])
>>> result = pipeline.fit(test_df).transform(test_df)
>>> result.select("image_assembler.origin", "answer.result").show(truncate=False)
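The task is selected by the prompt placed in the text column, so the same pipeline can be reused for other tasks. A minimal sketch, assuming the "<CAPTION>" task token from the Florence-2 documentation is supported by the loaded model:

>>> caption_df = image_df.withColumn("text", lit("<CAPTION>"))  # "<CAPTION>" assumed supported
>>> result = pipeline.fit(caption_df).transform(caption_df)
>>> result.select("answer.result").show(truncate=False)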
- setTopK(value)[source]#
Sets the number of highest probability vocabulary tokens to keep for top-k-filtering.
- setRepetitionPenalty(value)[source]#
Sets the parameter for repetition penalty. 1.0 means no penalty.