
class BertForMultipleChoice extends AnnotatorModel[BertForMultipleChoice] with HasBatchedAnnotate[BertForMultipleChoice] with WriteOnnxModel with WriteOpenvinoModel with HasCaseSensitiveProperties with HasEngine

BertForMultipleChoice can load BERT models with a multiple choice classification head on top (a linear layer on top of the pooled output, followed by a softmax), e.g. for RocStories/SWAG tasks.
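
The softmax step mentioned above can be sketched in plain Scala. This is only an illustration of how per-choice scores might be turned into probabilities and a selected answer; `ChoiceScoring`, `softmax`, and `pickChoice` are hypothetical names and are not part of Spark NLP, and the logits in the usage note are made up.

```scala
object ChoiceScoring {
  /** Numerically stable softmax: subtract the largest logit before exponentiating. */
  def softmax(logits: Seq[Double]): Seq[Double] = {
    val maxLogit = logits.max
    val exps = logits.map(l => math.exp(l - maxLogit))
    val total = exps.sum
    exps.map(_ / total)
  }

  /** Returns the choice with the highest probability, together with that probability. */
  def pickChoice(choices: Seq[String], logits: Seq[Double]): (String, Double) = {
    require(choices.nonEmpty && choices.length == logits.length,
      "need exactly one logit per choice")
    choices.zip(softmax(logits)).maxBy(_._2)
  }
}
```

For instance, `ChoiceScoring.pickChoice(Seq("Germany", "France", "Italy"), Seq(0.3, 2.1, -0.5))` selects "France", because it has the largest (invented) logit.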

Pretrained models can be loaded with pretrained of the companion object:

val multipleChoice = BertForMultipleChoice.pretrained()
  .setInputCols(Array("document_question", "document_context"))
  .setOutputCol("answer")

The default model is "bert_base_uncased_multiple_choice", if no name is provided.

For available pretrained models please see the Models Hub.

Models from the HuggingFace 🤗 Transformers library are also compatible with Spark NLP 🚀. To see which models are compatible and how to import them, see https://github.com/JohnSnowLabs/spark-nlp/discussions/5669. For more extended examples, see BertForMultipleChoiceTestSpec.

Example

import spark.implicits._
import com.johnsnowlabs.nlp.base._
import com.johnsnowlabs.nlp.annotator._
import org.apache.spark.ml.Pipeline

val document = new MultiDocumentAssembler()
  .setInputCols("question", "context")
  .setOutputCols("document_question", "document_context")

val questionAnswering = BertForMultipleChoice.pretrained()
  .setInputCols(Array("document_question", "document_context"))
  .setOutputCol("answer")
  .setCaseSensitive(false)

val pipeline = new Pipeline().setStages(Array(
  document,
  questionAnswering
))

// Each row is a (question, context) tuple so that toDF can produce two columns
val data = Seq(("The Eiffel Tower is located in which country?", "Germany, France, Italy")).toDF("question", "context")
val result = pipeline.fit(data).transform(data)

result.select("answer.result").show(false)
+--------+
|result  |
+--------+
|[France]|
+--------+

See also

BertForQuestionAnswering for Question Answering tasks

Annotators Main Page for a list of transformer based classifiers

Parameters

A list of (hyper-)parameter keys this annotator can take. Users can set and get the parameter values through setters and getters, respectively.

  1. val batchSize: IntParam

    Size of every batch (Default depends on model).

    Definition Classes
    HasBatchedAnnotate
  2. val caseSensitive: BooleanParam

    Whether to ignore case in index lookups (Default depends on model)

    Definition Classes
    HasCaseSensitiveProperties
  3. val engine: Param[String]

    This param is set internally once via loadSavedModel, which is why there is no setter.

    Definition Classes
    HasEngine
  4. val maxSentenceLength: IntParam

    Max sentence length to process (Default: 512)

  5. val vocabulary: MapFeature[String, Int]

    Vocabulary used to encode the words to ids with WordPieceEncoder
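
To make the roles of `vocabulary` and `maxSentenceLength` more concrete, here is a minimal sketch of greedy longest-match-first WordPiece encoding followed by truncation. It is not Spark NLP's WordPieceEncoder: `WordPieceSketch`, `tokenize`, and `encode` are hypothetical names, the sketch assumes the vocabulary contains an "[UNK]" entry and uses the "##" continuation prefix common to BERT vocabularies, and it ignores special start/end tokens.

```scala
object WordPieceSketch {
  /** Greedy longest-match-first split of a single word into vocabulary pieces.
    * Pieces after the first carry the "##" prefix.
    * Returns Seq(unk) when the word cannot be fully covered by the vocabulary.
    */
  def tokenize(word: String, vocab: Map[String, Int], unk: String = "[UNK]"): Seq[String] = {
    val pieces = scala.collection.mutable.ArrayBuffer.empty[String]
    var start = 0
    while (start < word.length) {
      var end = word.length
      var found: Option[String] = None
      // Shrink the candidate from the right until it appears in the vocabulary
      while (start < end && found.isEmpty) {
        val candidate = (if (start > 0) "##" else "") + word.substring(start, end)
        if (vocab.contains(candidate)) found = Some(candidate) else end -= 1
      }
      found match {
        case Some(piece) =>
          pieces += piece
          start = end
        case None =>
          return Seq(unk)
      }
    }
    pieces.toList
  }

  /** Maps pieces to ids and keeps at most maxLength ids (simple right truncation). */
  def encode(words: Seq[String], vocab: Map[String, Int], maxLength: Int): Seq[Int] =
    words
      .flatMap(w => tokenize(w, vocab))
      .map(p => vocab.getOrElse(p, vocab("[UNK]")))
      .take(maxLength)
}
```

With a toy vocabulary containing "tower" and "##s", the word "towers" splits into the two pieces "tower" and "##s"; a word with no matching pieces maps to "[UNK]".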

Members

  1. type AnnotatorType = String
    Definition Classes
    HasOutputAnnotatorType
  1. def batchAnnotate(batchedAnnotations: Seq[Array[Annotation]]): Seq[Seq[Annotation]]

    Takes a document and annotations and produces new annotations of this annotator's annotation type.

    batchedAnnotations

    Annotations in batches that correspond to inputAnnotationCols generated by previous annotators if any

    returns

    Any number of annotations processed for every batch of input annotations; not necessarily a one-to-one relationship. IMPORTANT: must return sequences of equal lengths. IMPORTANT: must return sentences that belong to the same original row (challenging).

    Definition Classes
    BertForMultipleChoice → HasBatchedAnnotate
  2. def batchProcess(rows: Iterator[_]): Iterator[Row]
    Definition Classes
    HasBatchedAnnotate
  3. val choicesDelimiter: Param[String]
  4. final def clear(param: Param[_]): BertForMultipleChoice.this.type
    Definition Classes
    Params
  5. def copy(extra: ParamMap): BertForMultipleChoice

    Required so that annotators can be copied.

    Definition Classes
    RawAnnotator → Model → Transformer → PipelineStage → Params
  6. def explainParam(param: Param[_]): String
    Definition Classes
    Params
  7. def explainParams(): String
    Definition Classes
    Params
  8. final def extractParamMap(): ParamMap
    Definition Classes
    Params
  9. final def extractParamMap(extra: ParamMap): ParamMap
    Definition Classes
    Params
  10. val features: ArrayBuffer[Feature[_, _, _]]
    Definition Classes
    HasFeatures
  11. final def get[T](param: Param[T]): Option[T]
    Definition Classes
    Params
  12. final def getDefault[T](param: Param[T]): Option[T]
    Definition Classes
    Params
  13. def getInputCols: Array[String]

    returns

    input annotations columns currently used

    Definition Classes
    HasInputAnnotationCols
  14. def getLazyAnnotator: Boolean
    Definition Classes
    CanBeLazy
  15. final def getOrDefault[T](param: Param[T]): T
    Definition Classes
    Params
  16. final def getOutputCol: String

    Gets the name of the annotation column that will be generated.

    Definition Classes
    HasOutputAnnotationCol
  17. def getParam(paramName: String): Param[Any]
    Definition Classes
    Params
  18. final def hasDefault[T](param: Param[T]): Boolean
    Definition Classes
    Params
  19. def hasParam(paramName: String): Boolean
    Definition Classes
    Params
  20. def hasParent: Boolean
    Definition Classes
    Model
  21. val inputAnnotatorTypes: Array[AnnotatorType]

    Annotator reference id. Used to identify elements in metadata or to refer to this annotator type

    Definition Classes
    BertForMultipleChoice → HasInputAnnotationCols
  22. final def isDefined(param: Param[_]): Boolean
    Definition Classes
    Params
  23. final def isSet(param: Param[_]): Boolean
    Definition Classes
    Params
  24. val lazyAnnotator: BooleanParam
    Definition Classes
    CanBeLazy
  25. def onWrite(path: String, spark: SparkSession): Unit
  26. val optionalInputAnnotatorTypes: Array[String]
    Definition Classes
    HasInputAnnotationCols
  27. val outputAnnotatorType: AnnotatorType
  28. lazy val params: Array[Param[_]]
    Definition Classes
    Params
  29. var parent: Estimator[BertForMultipleChoice]
    Definition Classes
    Model
  30. def save(path: String): Unit
    Definition Classes
    MLWritable
    Annotations
    @Since( "1.6.0" ) @throws( ... )
  31. final def set[T](param: Param[T], value: T): BertForMultipleChoice.this.type
    Definition Classes
    Params
  32. def setChoicesDelimiter(value: String): BertForMultipleChoice.this.type
  33. final def setInputCols(value: String*): BertForMultipleChoice.this.type
    Definition Classes
    HasInputAnnotationCols
  34. def setInputCols(value: Array[String]): BertForMultipleChoice.this.type

    Overrides required annotators column if different than default

    Definition Classes
    HasInputAnnotationCols
  35. def setLazyAnnotator(value: Boolean): BertForMultipleChoice.this.type
    Definition Classes
    CanBeLazy
  36. final def setOutputCol(value: String): BertForMultipleChoice.this.type

    Overrides annotation column name when transforming

    Definition Classes
    HasOutputAnnotationCol
  37. def setParent(parent: Estimator[BertForMultipleChoice]): BertForMultipleChoice
    Definition Classes
    Model
  38. def toString(): String
    Definition Classes
    Identifiable → AnyRef → Any
  39. final def transform(dataset: Dataset[_]): DataFrame

    Given that requirements are met, this applies the ML transformation within a Pipeline or stand-alone. The output annotation is generated as a new column; previous annotations remain available separately. Metadata is built at schema level to record the annotations' structural information outside their content.

    dataset

    Dataset[Row]

    Definition Classes
    AnnotatorModel → Transformer
  40. def transform(dataset: Dataset[_], paramMap: ParamMap): DataFrame
    Definition Classes
    Transformer
    Annotations
    @Since( "2.0.0" )
  41. def transform(dataset: Dataset[_], firstParamPair: ParamPair[_], otherParamPairs: ParamPair[_]*): DataFrame
    Definition Classes
    Transformer
    Annotations
    @Since( "2.0.0" ) @varargs()
  42. final def transformSchema(schema: StructType): StructType

    Requirement for pipeline transformation validation. It is called on fit().

    Definition Classes
    RawAnnotator → PipelineStage
  43. val uid: String
    Definition Classes
    BertForMultipleChoice → Identifiable
  44. def write: MLWriter
    Definition Classes
    ParamsAndFeaturesWritable → DefaultParamsWritable → MLWritable
  45. def writeOnnxModel(path: String, spark: SparkSession, onnxWrapper: OnnxWrapper, suffix: String, fileName: String): Unit
    Definition Classes
    WriteOnnxModel
  46. def writeOnnxModels(path: String, spark: SparkSession, onnxWrappersWithNames: Seq[(OnnxWrapper, String)], suffix: String): Unit
    Definition Classes
    WriteOnnxModel
  47. def writeOpenvinoModel(path: String, spark: SparkSession, openvinoWrapper: OpenvinoWrapper, suffix: String, fileName: String): Unit
    Definition Classes
    WriteOpenvinoModel
  48. def writeOpenvinoModels(path: String, spark: SparkSession, ovWrappersWithNames: Seq[(OpenvinoWrapper, String)], suffix: String): Unit
    Definition Classes
    WriteOpenvinoModel
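
This page does not describe `choicesDelimiter`. Judging by the example above, where all candidate answers arrive in one context string ("Germany, France, Italy"), it plausibly names the separator between choices. Under that assumption, a minimal sketch of the splitting step might look as follows; `ChoiceSplitting` and `splitChoices` are hypothetical names, and the "," default here is an assumption rather than a documented value.

```scala
object ChoiceSplitting {
  /** Splits a context string into trimmed, non-empty candidate answers.
    * The delimiter is treated as a literal string, not as a regular expression.
    */
  def splitChoices(context: String, delimiter: String = ","): Seq[String] = {
    require(delimiter.nonEmpty, "delimiter must not be empty")
    context
      .split(java.util.regex.Pattern.quote(delimiter))
      .map(_.trim)
      .filter(_.nonEmpty)
      .toList
  }
}
```

Quoting the delimiter keeps characters such as "|" from being interpreted as regex operators, so `splitChoices("a | b", "|")` yields the two choices "a" and "b".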

Parameter setters

  1. def sentenceEndTokenId: Int

  2. def sentenceStartTokenId: Int

  3. def setBatchSize(size: Int): BertForMultipleChoice.this.type

    Size of every batch.

    Definition Classes
    HasBatchedAnnotate
  4. def setCaseSensitive(value: Boolean): BertForMultipleChoice.this.type

    Definition Classes
    HasCaseSensitiveProperties
  5. def setMaxSentenceLength(value: Int): BertForMultipleChoice.this.type

  6. def setModelIfNotSet(spark: SparkSession, tensorflowWrapper: Option[TensorflowWrapper], onnxWrapper: Option[OnnxWrapper], openvinoWrapper: Option[OpenvinoWrapper]): BertForMultipleChoice

  7. def setVocabulary(value: Map[String, Int]): BertForMultipleChoice.this.type
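
As a rough illustration of what a batch size controls, the sketch below groups inputs into consecutive batches of at most `batchSize` items and checks that every batch returns one result per input, in the spirit of the equal-lengths requirement noted for `batchAnnotate`. `BatchingSketch` and `processInBatches` are hypothetical names; this is not how Spark NLP implements batching internally.

```scala
object BatchingSketch {
  /** Applies `processBatch` to consecutive groups of at most `batchSize` items
    * and checks that each batch yields exactly one result per input.
    */
  def processInBatches[A, B](items: Seq[A], batchSize: Int)(processBatch: Seq[A] => Seq[B]): Seq[B] = {
    require(batchSize > 0, "batchSize must be positive")
    items
      .grouped(batchSize)
      .flatMap { batch =>
        val out = processBatch(batch)
        require(out.length == batch.length, "a batch must return one result per input")
        out
      }
      .toVector // force evaluation so the length check runs eagerly
  }
}
```

A larger batch size means fewer calls to `processBatch` for the same input, which is typically the trade-off between throughput and memory when the batch is sent to a model.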

Parameter getters

  1. def getBatchSize: Int

    Size of every batch.

    Definition Classes
    HasBatchedAnnotate
  2. def getCaseSensitive: Boolean

    Definition Classes
    HasCaseSensitiveProperties
  3. def getEngine: String

    Definition Classes
    HasEngine
  4. def getModelIfNotSet: BertClassification