
com.johnsnowlabs.ml.tensorflow

ClassifierDatasetEncoder

class ClassifierDatasetEncoder extends Serializable

Linear Supertypes
Serializable, Serializable, AnyRef, Any

Instance Constructors

  1. new ClassifierDatasetEncoder(params: ClassifierDatasetEncoderParams)

Value Members

  1. def calculateEmbeddingsDim(dataset: DataFrame): Int
  2. def collectTrainingInstances(dataset: DataFrame, labelCol: String): Array[Array[(String, Array[Float])]]

    Converts DataFrame to Array of Arrays of (label, embeddings) pairs

    dataset

    Input DataFrame with embeddings and labels

    returns

    Array of Array of (String, Array[Float]) tuples

  3. def collectTrainingInstancesMultiLabel(dataset: DataFrame, labelCol: String): Array[Array[(Array[String], Array[Float])]]

    Converts DataFrame to labels and embeddings

    dataset

    Input DataFrame with embeddings and labels

    returns

    Array of Array of (Array[String], Array[Float]) tuples

  4. def decodeOutputData(tagIds: Array[Array[Float]]): Array[Array[(String, Float)]]

    Converts Tag Identifiers to Tag Names

    tagIds

    Tag Ids encoded for TensorFlow Model

    returns

    Array of Array of (tag name, Float) pairs

  5. def encodeTags(labels: Array[String]): Array[Array[Int]]
  6. def encodeTagsMultiLabel(labels: Array[Array[String]]): Array[Array[Float]]
  7. def extractLabels(dataset: Array[Array[(String, Array[Float])]]): Array[String]

    Extracts the labels (string) from the collected training instances

    dataset

    Array of Arrays of (label, embeddings) pairs

    returns

    Array of String

  8. def extractLabelsMultiLabel(dataset: Array[Array[(Array[String], Array[Float])]]): Array[Array[String]]

    Extracts the label arrays from the collected multi-label training instances

    dataset

    Array of Arrays of (labels, embeddings) pairs

    returns

    Array of Array of String

  9. def extractSentenceEmbeddings(docs: Seq[(Int, Seq[Annotation])]): Array[Array[Float]]

    Converts annotated documents to Array of Arrays of Embeddings

    docs

    Sequence of (Int, Seq[Annotation]) pairs carrying sentence_embeddings

    returns

    Array of Array of Float

  10. def extractSentenceEmbeddings(dataset: Array[Array[(String, Array[Float])]]): Array[Array[Float]]

    Converts collected training instances to Array of Arrays of Embeddings

    dataset

    Array of Arrays of (label, embeddings) pairs

    returns

    Array of Array of Float

  11. def extractSentenceEmbeddingsMultiLabel(docs: Seq[(Int, Seq[Annotation])]): Array[Array[Array[Float]]]

    Converts annotated documents to Array of Arrays of Arrays of Embeddings. Unlike extractSentenceEmbeddings, this function creates a sequence when a document contains multiple sentences. Used in MultiClassifierDL.

    docs

    Sequence of (Int, Seq[Annotation]) pairs carrying sentence_embeddings

    returns

    Array of Arrays of Arrays of Floats

  12. def extractSentenceEmbeddingsMultiLabel(dataset: Array[Array[(Array[String], Array[Float])]]): Array[Array[Array[Float]]]

    Converts collected multi-label training instances to Array of Arrays of Arrays of Embeddings. Unlike extractSentenceEmbeddings, this function creates a sequence when a document contains multiple sentences. Used in MultiClassifierDL.

    dataset

    Array of Arrays of (labels, embeddings) pairs

    returns

    Array of Arrays of Arrays of Floats

  13. def extractSentenceEmbeddingsMultiLabelPredict(docs: Seq[(Int, Seq[Annotation])]): Array[Array[Array[Float]]]
  14. val params: ClassifierDatasetEncoderParams
  15. val tags: Array[String]
  16. val tags2Id: Map[String, Int]
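
The signatures of encodeTags, encodeTagsMultiLabel and decodeOutputData suggest a label encoding indexed by tags2Id. The following is a minimal standalone sketch of that idea, not the library's implementation. It assumes one-hot vectors for single labels, multi-hot vectors for label sets, and that decoding pairs each Float with the tag at the same index. The object name LabelEncodingSketch and the sample tags are hypothetical.

```scala
// Standalone sketch; it does not depend on Spark NLP.
// Assumption: labels become one-hot (single-label) or multi-hot
// (multi-label) vectors indexed by each tag's position in `tags`.
object LabelEncodingSketch {
  // Mirrors the `tags` and `tags2Id` members; the values are made up.
  val tags: Array[String] = Array("negative", "neutral", "positive")
  val tags2Id: Map[String, Int] = tags.zipWithIndex.toMap

  // One-hot: a single 1 at the index of the label.
  def encodeTags(labels: Array[String]): Array[Array[Int]] =
    labels.map { label =>
      val row = Array.fill(tags.length)(0)
      row(tags2Id(label)) = 1
      row
    }

  // Multi-hot: 1.0f at the index of every label in the set.
  def encodeTagsMultiLabel(labels: Array[Array[String]]): Array[Array[Float]] =
    labels.map { labelSet =>
      val row = Array.fill(tags.length)(0.0f)
      labelSet.foreach(label => row(tags2Id(label)) = 1.0f)
      row
    }

  // Pairs each Float with the tag name stored at the same index.
  def decodeOutputData(scores: Array[Array[Float]]): Array[Array[(String, Float)]] =
    scores.map(_.zipWithIndex.map { case (score, idx) => (tags(idx), score) })
}
```

Under these assumptions, a label that is missing from tags2Id throws NoSuchElementException in both encoders, so the tag set has to cover every label seen in the training data.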