class BLIPForQuestionAnswering extends AnnotatorModel[BLIPForQuestionAnswering] with HasBatchedAnnotateImage[BLIPForQuestionAnswering] with HasImageFeatureProperties with WriteTensorflowModel with HasEngine

BLIPForQuestionAnswering can load BLIP models for visual question answering. The model consists of a vision encoder, a text encoder, and a text decoder. The vision encoder encodes the input image, the text encoder encodes the input question together with the image encoding, and the text decoder outputs the answer to the question.
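
The three-stage flow described above can be pictured as a composition of functions. The sketch below is purely illustrative: every type and function in it is a hypothetical stand-in with a toy body, not part of Spark NLP or of the BLIP implementation. It only shows which stage consumes which input.

```scala
object BlipFlowSketch {
  // Hypothetical stand-in types; the real model passes tensors between stages.
  final case class ImageEncoding(features: Vector[Double])
  final case class QuestionEncoding(features: Vector[Double])

  // Stage 1: the vision encoder sees only the image.
  def visionEncoder(pixels: Vector[Double]): ImageEncoding =
    ImageEncoding(pixels)

  // Stage 2: the text encoder sees the question together with the image encoding.
  def textEncoder(question: String, image: ImageEncoding): QuestionEncoding =
    QuestionEncoding(image.features :+ question.length.toDouble)

  // Stage 3: the text decoder turns the fused encoding into an answer string.
  def textDecoder(encoding: QuestionEncoding): String =
    s"answer derived from ${encoding.features.size} features"

  // The full flow: image -> vision encoder -> text encoder (with question) -> text decoder.
  def answer(pixels: Vector[Double], question: String): String =
    textDecoder(textEncoder(question, visionEncoder(pixels)))
}
```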

Pretrained models can be loaded with the pretrained method of the companion object:

val visualQAClassifier = BLIPForQuestionAnswering.pretrained()
  .setInputCols("image_assembler")
  .setOutputCol("answer")

If no name is provided, the default model is "blip_vqa_base".

For available pretrained models, see the Models Hub.

Models from the HuggingFace 🤗 Transformers library are also compatible with Spark NLP 🚀. To see which models are compatible and how to import them, see https://github.com/JohnSnowLabs/spark-nlp/discussions/5669. For more extended examples, see https://github.com/JohnSnowLabs/spark-nlp/blob/master/src/test/scala/com/johnsnowlabs/nlp/annotators/cv/BLIPForQuestionAnsweringTest.scala.
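
As a rough sketch of the import path mentioned above, an exported model folder can be loaded with loadSavedModel (the method referenced by the engine parameter), saved in Spark NLP format, and loaded again. Both paths below are hypothetical placeholders, and the export step that produces the folder is assumed to follow the linked discussion; it is not shown here.

```scala
import com.johnsnowlabs.nlp.SparkNLP
import com.johnsnowlabs.nlp.annotators.cv.BLIPForQuestionAnswering

val spark = SparkNLP.start()

// Hypothetical paths: replace with your exported model folder and a target location.
val exportedModelPath = "/tmp/blip_vqa_export"
val sparkNlpModelPath = "/tmp/blip_vqa_spark_nlp"

// Load the exported model and configure its columns.
val imported = BLIPForQuestionAnswering
  .loadSavedModel(exportedModelPath, spark)
  .setInputCols("image_assembler")
  .setOutputCol("answer")

// Persist it in Spark NLP format so that later runs can skip the import step.
imported.write.overwrite().save(sparkNlpModelPath)

// Reload the saved annotator.
val reloaded = BLIPForQuestionAnswering.load(sparkNlpModelPath)
```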

Example

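import spark.implicits._
import com.johnsnowlabs.nlp.base._
import com.johnsnowlabs.nlp.annotator._
import com.johnsnowlabs.nlp.util.io.ResourceHelper
import org.apache.spark.ml.Pipeline
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.lit

// imageFolder is a placeholder for the path of a directory that contains the input images
val imageDF: DataFrame = ResourceHelper.spark.read
  .format("image")
  .option("dropInvalid", value = true)
  .load(imageFolder)

// Attach the question to every image row
val testDF: DataFrame = imageDF.withColumn("text", lit("What's this picture about?"))

// Convert the raw image column into the annotation format expected by the model
val imageAssembler: ImageAssembler = new ImageAssembler()
  .setInputCol("image")
  .setOutputCol("image_assembler")

val visualQAClassifier = BLIPForQuestionAnswering.pretrained()
  .setInputCols("image_assembler")
  .setOutputCol("answer")

val pipeline = new Pipeline().setStages(Array(
  imageAssembler,
  visualQAClassifier
))

val result = pipeline.fit(testDF).transform(testDF)

// Show the source file of each image next to the generated answer
result.select("image_assembler.origin", "answer.result").show(false)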
+--------------------------------------+------+
|origin                                |result|
+--------------------------------------+------+
|[file:///content/images/cat_image.jpg]|[cats]|
+--------------------------------------+------+

See also

CLIPForZeroShotClassification for zero-shot image classification

Annotators Main Page for a list of transformer-based classifiers

Parameters

A list of (hyper-)parameter keys this annotator can take. Users can set and get the parameter values through setters and getters, respectively.

  1. val batchSize: IntParam

    Size of every batch (Default depends on model).

    Definition Classes
    HasBatchedAnnotateImage
  2. val configProtoBytes: IntArrayParam

    ConfigProto from tensorflow, serialized into byte array. Get with config_proto.SerializeToString()

  3. val doNormalize: BooleanParam

    Whether or not to normalize the input with mean and standard deviation

    Definition Classes
    HasImageFeatureProperties
  4. val doResize: BooleanParam

    Whether to resize the input to a certain size

    Definition Classes
    HasImageFeatureProperties
  5. val engine: Param[String]

    This param is set internally once via loadSavedModel. That's why there is no setter.

    Definition Classes
    HasEngine
  6. val featureExtractorType: Param[String]

    Name of model's architecture for feature extraction

    Definition Classes
    HasImageFeatureProperties
  7. val imageMean: DoubleArrayParam

    The sequence of means for each channel, to be used when normalizing images

    Definition Classes
    HasImageFeatureProperties
  8. val imageStd: DoubleArrayParam

    The sequence of standard deviations for each channel, to be used when normalizing images

    Definition Classes
    HasImageFeatureProperties
  9. val maxSentenceLength: IntParam

    Max sentence length to process (Default: 512)

  10. val resample: IntParam

    An optional resampling filter. This can be one of PIL.Image.NEAREST, PIL.Image.BOX, PIL.Image.BILINEAR, PIL.Image.HAMMING, PIL.Image.BICUBIC or PIL.Image.LANCZOS. Only has an effect if doResize is set to true.

    Definition Classes
    HasImageFeatureProperties
  11. val signatures: MapFeature[String, String]

    Contains the TF model signatures for the loaded saved model

  12. val size: IntParam

    Resize the input to the given size. If a tuple is provided, it should be (width, height). If only an integer is provided, then the input will be resized to (size, size). Only has an effect if doResize is set to true.

    Definition Classes
    HasImageFeatureProperties
  13. val vocabulary: MapFeature[String, Int]

    Vocabulary used to encode the words to ids with WordPieceEncoder
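
To make the roles of doNormalize, imageMean and imageStd more concrete, the following is a minimal sketch of per-channel normalization in plain Scala. It assumes that channel values are first rescaled from 0-255 to the range [0, 1] and then normalized as (x - mean) / std. NormalizeSketch, normalizePixel and the 0.5 values in the usage note are hypothetical illustrations, not the annotator's internal code or the defaults of any particular model.

```scala
object NormalizeSketch {

  /** Normalizes one pixel whose channels are given as 0-255 integers.
    *
    * Assumption: each channel is rescaled to [0, 1] and then normalized
    * as (x - mean) / std with that channel's mean and standard deviation.
    */
  def normalizePixel(
      channels: Array[Int],
      imageMean: Array[Double],
      imageStd: Array[Double]): Array[Double] = {
    require(
      channels.length == imageMean.length && channels.length == imageStd.length,
      "one mean and one standard deviation per channel are required")

    channels.indices.map { c =>
      val rescaled = channels(c) / 255.0 // 0-255 -> [0, 1]
      (rescaled - imageMean(c)) / imageStd(c)
    }.toArray
  }
}
```

For example, NormalizeSketch.normalizePixel(Array(255, 0, 128), Array(0.5, 0.5, 0.5), Array(0.5, 0.5, 0.5)) maps the brightest channel value to 1.0 and the darkest to -1.0.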

Members

  1. type AnnotatorType = String
    Definition Classes
    HasOutputAnnotatorType
  1. def batchAnnotate(batchedAnnotations: Seq[Array[AnnotationImage]]): Seq[Seq[Annotation]]

    Takes a document and annotations and produces new annotations of this annotator's annotation type.

    batchedAnnotations

    Annotations in batches that correspond to inputAnnotationCols generated by previous annotators, if any

    returns

    Any number of annotations processed for every batch of input annotations. Not necessarily a one-to-one relationship.

    Definition Classes
    BLIPForQuestionAnswering → HasBatchedAnnotateImage
  2. def batchProcess(rows: Iterator[_]): Iterator[Row]
    Definition Classes
    HasBatchedAnnotateImage
  3. final def clear(param: Param[_]): BLIPForQuestionAnswering.this.type
    Definition Classes
    Params
  4. def copy(extra: ParamMap): BLIPForQuestionAnswering

    Requirement for annotator copies.

    Definition Classes
    RawAnnotator → Model → Transformer → PipelineStage → Params
  5. def explainParam(param: Param[_]): String
    Definition Classes
    Params
  6. def explainParams(): String
    Definition Classes
    Params
  7. final def extractParamMap(): ParamMap
    Definition Classes
    Params
  8. final def extractParamMap(extra: ParamMap): ParamMap
    Definition Classes
    Params
  9. val features: ArrayBuffer[Feature[_, _, _]]
    Definition Classes
    HasFeatures
  10. final def get[T](param: Param[T]): Option[T]
    Definition Classes
    Params
  11. final def getDefault[T](param: Param[T]): Option[T]
    Definition Classes
    Params
  12. def getInputCols: Array[String]

    returns

    input annotations columns currently used

    Definition Classes
    HasInputAnnotationCols
  13. def getLazyAnnotator: Boolean
    Definition Classes
    CanBeLazy
  14. final def getOrDefault[T](param: Param[T]): T
    Definition Classes
    Params
  15. final def getOutputCol: String

    Gets the name of the annotation column that will be generated.

    Definition Classes
    HasOutputAnnotationCol
  16. def getParam(paramName: String): Param[Any]
    Definition Classes
    Params
  17. final def hasDefault[T](param: Param[T]): Boolean
    Definition Classes
    Params
  18. def hasParam(paramName: String): Boolean
    Definition Classes
    Params
  19. def hasParent: Boolean
    Definition Classes
    Model
  20. val inputAnnotatorTypes: Array[AnnotatorType]

    Annotator reference id. Used to identify elements in metadata or to refer to this annotator type.

    Definition Classes
    BLIPForQuestionAnswering → HasInputAnnotationCols
  21. final def isDefined(param: Param[_]): Boolean
    Definition Classes
    Params
  22. final def isSet(param: Param[_]): Boolean
    Definition Classes
    Params
  23. val lazyAnnotator: BooleanParam
    Definition Classes
    CanBeLazy
  24. def onWrite(path: String, spark: SparkSession): Unit
  25. val optionalInputAnnotatorTypes: Array[String]
    Definition Classes
    HasInputAnnotationCols
  26. val outputAnnotatorType: AnnotatorType
  27. lazy val params: Array[Param[_]]
    Definition Classes
    Params
  28. var parent: Estimator[BLIPForQuestionAnswering]
    Definition Classes
    Model
  29. def save(path: String): Unit
    Definition Classes
    MLWritable
    Annotations
    @Since( "1.6.0" ) @throws( ... )
  30. final def set[T](param: Param[T], value: T): BLIPForQuestionAnswering.this.type
    Definition Classes
    Params
  31. final def setInputCols(value: String*): BLIPForQuestionAnswering.this.type
    Definition Classes
    HasInputAnnotationCols
  32. def setInputCols(value: Array[String]): BLIPForQuestionAnswering.this.type

    Overrides the required annotator columns if they differ from the default.

    Definition Classes
    HasInputAnnotationCols
  33. def setLazyAnnotator(value: Boolean): BLIPForQuestionAnswering.this.type
    Definition Classes
    CanBeLazy
  34. final def setOutputCol(value: String): BLIPForQuestionAnswering.this.type

    Overrides annotation column name when transforming

    Definition Classes
    HasOutputAnnotationCol
  35. def setParent(parent: Estimator[BLIPForQuestionAnswering]): BLIPForQuestionAnswering
    Definition Classes
    Model
  36. def toString(): String
    Definition Classes
    Identifiable → AnyRef → Any
  37. final def transform(dataset: Dataset[_]): DataFrame

    Given requirements are met, this applies the ML transformation within a Pipeline or stand-alone. The output annotation will be generated as a new column; previous annotations are still available separately. Metadata is built at schema level to record the annotations' structural information outside their content.

    dataset

    Dataset[Row]

    Definition Classes
    AnnotatorModel → Transformer
  38. def transform(dataset: Dataset[_], paramMap: ParamMap): DataFrame
    Definition Classes
    Transformer
    Annotations
    @Since( "2.0.0" )
  39. def transform(dataset: Dataset[_], firstParamPair: ParamPair[_], otherParamPairs: ParamPair[_]*): DataFrame
    Definition Classes
    Transformer
    Annotations
    @Since( "2.0.0" ) @varargs()
  40. final def transformSchema(schema: StructType): StructType

    Requirement for pipeline transformation validation. It is called on fit().

    Definition Classes
    RawAnnotator → PipelineStage
  41. val uid: String
    Definition Classes
    BLIPForQuestionAnswering → Identifiable
  42. def write: MLWriter
    Definition Classes
    ParamsAndFeaturesWritable → DefaultParamsWritable → MLWritable
  43. def writeTensorflowHub(path: String, tfPath: String, spark: SparkSession, suffix: String = "_use"): Unit
    Definition Classes
    WriteTensorflowModel
  44. def writeTensorflowModel(path: String, spark: SparkSession, tensorflow: TensorflowWrapper, suffix: String, filename: String, configProtoBytes: Option[Array[Byte]] = None): Unit
    Definition Classes
    WriteTensorflowModel
  45. def writeTensorflowModelV2(path: String, spark: SparkSession, tensorflow: TensorflowWrapper, suffix: String, filename: String, configProtoBytes: Option[Array[Byte]] = None, savedSignatures: Option[Map[String, String]] = None): Unit
    Definition Classes
    WriteTensorflowModel

Parameter setters

  1. def setBatchSize(size: Int): BLIPForQuestionAnswering.this.type

    Size of every batch.

    Definition Classes
    HasBatchedAnnotateImage
  2. def setConfigProtoBytes(bytes: Array[Int]): BLIPForQuestionAnswering.this.type

    ConfigProto from tensorflow, serialized into byte array. Get with config_proto.SerializeToString()

  3. def setDoNormalize(value: Boolean): BLIPForQuestionAnswering.this.type

    Definition Classes
    HasImageFeatureProperties
  4. def setDoResize(value: Boolean): BLIPForQuestionAnswering.this.type

    Definition Classes
    HasImageFeatureProperties
  5. def setFeatureExtractorType(value: String): BLIPForQuestionAnswering.this.type

    Definition Classes
    HasImageFeatureProperties
  6. def setImageMean(value: Array[Double]): BLIPForQuestionAnswering.this.type

    Definition Classes
    HasImageFeatureProperties
  7. def setImageStd(value: Array[Double]): BLIPForQuestionAnswering.this.type

    Definition Classes
    HasImageFeatureProperties
  8. def setMaxSentenceLength(value: Int): BLIPForQuestionAnswering.this.type

  9. def setModelIfNotSet(spark: SparkSession, preprocessor: Preprocessor, tensorflow: TensorflowWrapper): BLIPForQuestionAnswering.this.type

  10. def setResample(value: Int): BLIPForQuestionAnswering.this.type

    Definition Classes
    HasImageFeatureProperties
  11. def setSignatures(value: Map[String, String]): BLIPForQuestionAnswering.this.type

  12. def setSize(value: Int): BLIPForQuestionAnswering.this.type

    Definition Classes
    HasImageFeatureProperties
  13. def setVocabulary(value: Map[String, Int]): BLIPForQuestionAnswering.this.type
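
Because the setters listed above return the annotator itself, they can be chained when the annotator is configured, and the matching getters read the values back. The following is a small sketch; the batch size and sentence length are illustrative values chosen for the example, not recommendations, and loading the pretrained model assumes that it can be downloaded.

```scala
import com.johnsnowlabs.nlp.SparkNLP
import com.johnsnowlabs.nlp.annotators.cv.BLIPForQuestionAnswering

val spark = SparkNLP.start()

// Chain setters while configuring the annotator (values are illustrative only).
val tunedClassifier = BLIPForQuestionAnswering.pretrained()
  .setInputCols("image_assembler")
  .setOutputCol("answer")
  .setBatchSize(4)
  .setMaxSentenceLength(64)

// Read the configured values back through the corresponding getters.
println(tunedClassifier.getBatchSize)
println(tunedClassifier.getMaxSentenceLength)
```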

Parameter getters

  1. def getBatchSize: Int

    Size of every batch.

    Definition Classes
    HasBatchedAnnotateImage
  2. def getConfigProtoBytes: Option[Array[Byte]]

    ConfigProto from tensorflow, serialized into byte array. Get with config_proto.SerializeToString()

  3. def getDoNormalize: Boolean

    Definition Classes
    HasImageFeatureProperties
  4. def getDoResize: Boolean

    Definition Classes
    HasImageFeatureProperties
  5. def getEngine: String

    Definition Classes
    HasEngine
  6. def getFeatureExtractorType: String

    Definition Classes
    HasImageFeatureProperties
  7. def getImageMean: Array[Double]

    Definition Classes
    HasImageFeatureProperties
  8. def getImageStd: Array[Double]

    Definition Classes
    HasImageFeatureProperties
  9. def getMaxSentenceLength: Int

  10. def getModelIfNotSet: BLIPClassifier

  11. def getResample: Int

    Definition Classes
    HasImageFeatureProperties
  12. def getSignatures: Option[Map[String, String]]

  13. def getSize: Int

    Definition Classes
    HasImageFeatureProperties