package dep

Type Members

  1. class DependencyParserApproach extends AnnotatorApproach[DependencyParserModel]

    Trains an unlabeled parser that finds grammatical relations between two words in a sentence.

    For instantiated/pretrained models, see DependencyParserModel.

    A dependency parser provides information about the relationships between words. For example, dependency parsing can tell you what the subjects and objects of a verb are, as well as which words modify (describe) the subject. This can help you find precise answers to specific questions.

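    The required training data can be set in one of two ways (only one can be chosen for a particular model): as a dependency treebank with setDependencyTreeBank, or as a dataset in CoNLL-U format with setConllU.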

    Apart from that, no additional training data is needed.
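
To train from a CoNLL-U file rather than a treebank, the approach might be configured as in the fragment below. This is a minimal sketch: it assumes setConllU accepts a file path in the same way setDependencyTreeBank does, and the path shown is hypothetical.

```scala
import com.johnsnowlabs.nlp.annotators.parser.dep.DependencyParserApproach

// Hypothetical path: replace with the location of your own CoNLL-U training file.
val conllUParser = new DependencyParserApproach()
  .setInputCols("sentence", "pos", "token")
  .setOutputCol("dependency")
  .setConllU("path/to/train.conllu")
```

Only one of the two training sources should be set on a given instance.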

    See DependencyParserApproachTestSpec for further reference on how to use this API.

    Example

    import spark.implicits._
    import com.johnsnowlabs.nlp.base.DocumentAssembler
    import com.johnsnowlabs.nlp.annotators.sbd.pragmatic.SentenceDetector
    import com.johnsnowlabs.nlp.annotators.Tokenizer
    import com.johnsnowlabs.nlp.annotators.pos.perceptron.PerceptronModel
    import com.johnsnowlabs.nlp.annotators.parser.dep.DependencyParserApproach
    import org.apache.spark.ml.Pipeline
    
    val documentAssembler = new DocumentAssembler()
      .setInputCol("text")
      .setOutputCol("document")
    
    val sentence = new SentenceDetector()
      .setInputCols("document")
      .setOutputCol("sentence")
    
    val tokenizer = new Tokenizer()
      .setInputCols("sentence")
      .setOutputCol("token")
    
    val posTagger = PerceptronModel.pretrained()
      .setInputCols("sentence", "token")
      .setOutputCol("pos")
    
    val dependencyParserApproach = new DependencyParserApproach()
      .setInputCols("sentence", "pos", "token")
      .setOutputCol("dependency")
      .setDependencyTreeBank("src/test/resources/parser/unlabeled/dependency_treebank")
    
    val pipeline = new Pipeline().setStages(Array(
      documentAssembler,
      sentence,
      tokenizer,
      posTagger,
      dependencyParserApproach
    ))
    
    // No additional training data is needed; the dependency parser relies only on the dependency treebank or CoNLL-U data.
    val emptyDataSet = Seq.empty[String].toDF("text")
    val pipelineModel = pipeline.fit(emptyDataSet)

    See also

    TypedDependencyParserApproach to extract labels for the dependencies

  2. class DependencyParserModel extends AnnotatorModel[DependencyParserModel] with HasSimpleAnnotate[DependencyParserModel]

    Unlabeled parser that finds a grammatical relation between two words in a sentence.

    A dependency parser provides information about the relationships between words. For example, dependency parsing can tell you what the subjects and objects of a verb are, as well as which words modify (describe) the subject. This can help you find precise answers to specific questions.

    This is the instantiated model of the DependencyParserApproach. For training your own model, please see the documentation of that class.

    Pretrained models can be loaded with the pretrained method of the companion object:

    val dependencyParser = DependencyParserModel.pretrained()
      .setInputCols("sentence", "pos", "token")
      .setOutputCol("dependency")

    The default model is "dependency_conllu" if no name is provided. For available pretrained models, please see the Models Hub.

    For extended examples of usage, see the Examples and the DependencyParserApproachTestSpec.

    Example

    import spark.implicits._
    import com.johnsnowlabs.nlp.base.DocumentAssembler
    import com.johnsnowlabs.nlp.annotators.Tokenizer
    import com.johnsnowlabs.nlp.annotators.parser.dep.DependencyParserModel
    import com.johnsnowlabs.nlp.annotators.pos.perceptron.PerceptronModel
    import com.johnsnowlabs.nlp.annotators.sbd.pragmatic.SentenceDetector
    import org.apache.spark.ml.Pipeline
    
    val documentAssembler = new DocumentAssembler()
      .setInputCol("text")
      .setOutputCol("document")
    
    val sentence = new SentenceDetector()
      .setInputCols("document")
      .setOutputCol("sentence")
    
    val tokenizer = new Tokenizer()
      .setInputCols("sentence")
      .setOutputCol("token")
    
    val posTagger = PerceptronModel.pretrained()
      .setInputCols("sentence", "token")
      .setOutputCol("pos")
    
    val dependencyParser = DependencyParserModel.pretrained()
      .setInputCols("sentence", "pos", "token")
      .setOutputCol("dependency")
    
    val pipeline = new Pipeline().setStages(Array(
      documentAssembler,
      sentence,
      tokenizer,
      posTagger,
      dependencyParser
    ))
    
    val data = Seq(
      "Unions representing workers at Turner Newall say they are 'disappointed' after talks with stricken parent " +
        "firm Federal Mogul."
    ).toDF("text")
    val result = pipeline.fit(data).transform(data)
    
    result.selectExpr("explode(arrays_zip(token.result, dependency.result)) as cols")
      .selectExpr("cols['0'] as token", "cols['1'] as dependency").show(8, truncate = false)
    +------------+------------+
    |token       |dependency  |
    +------------+------------+
    |Unions      |ROOT        |
    |representing|workers     |
    |workers     |Unions      |
    |at          |Turner      |
    |Turner      |workers     |
    |Newall      |say         |
    |say         |Unions      |
    |they        |disappointed|
    +------------+------------+
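
Each row of this output pairs a token with the word it attaches to (its head), and ROOT marks the top of the tree. As a rough illustration of how the pairs can be read, the plain Scala sketch below groups the tokens shown above by their head. The names rows, dependentsByHead and root are hypothetical helpers rather than part of the library, and matching heads by word text is an assumption that becomes ambiguous when a word occurs more than once in a sentence.

```scala
// Rows copied from the example output above, as (token, head) pairs.
val rows = Seq(
  "Unions" -> "ROOT",
  "representing" -> "workers",
  "workers" -> "Unions",
  "at" -> "Turner",
  "Turner" -> "workers",
  "Newall" -> "say",
  "say" -> "Unions",
  "they" -> "disappointed"
)

// Group tokens under the head word they attach to, keeping sentence order.
val dependentsByHead: Map[String, Seq[String]] =
  rows.groupBy(_._2).map { case (head, pairs) => head -> pairs.map(_._1) }

// The token whose head is ROOT sits at the top of the dependency tree.
val root: Option[String] = rows.collectFirst { case (token, "ROOT") => token }
```

With these rows, dependentsByHead("workers") lists the tokens attached to "workers", and root holds the token marked ROOT.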

    See also

    TypedDependencyParserModel to extract labels for the dependencies

  3. class Perceptron extends Serializable
  4. trait ReadablePretrainedDependency extends ParamsAndFeaturesReadable[DependencyParserModel] with HasPretrained[DependencyParserModel]
  5. class Tagger extends Serializable

Value Members

  1. object DependencyParserApproach extends DefaultParamsReadable[DependencyParserApproach] with Serializable

    This is the companion object of DependencyParserApproach. Please refer to that class for the documentation.

  2. object DependencyParserModel extends ReadablePretrainedDependency with Serializable

    This is the companion object of DependencyParserModel. Please refer to that class for the documentation.

  3. object TagDictionary
  4. object Tagger extends Serializable
