Packages

package pragmatic

Type Members

  1. class PragmaticScorer extends Serializable

    Scorer is a rule-based implementation inspired by http://fjavieralba.com/basic-sentiment-analysis-with-python.html. Its strategy is to tag the words of a sentence using a dictionary, and then to inspect the surrounding context of each tagged word to find amplifiers.
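
    The tag-then-amplify strategy can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the object name ScorerSketch, the tag names, the tiny dictionary, and the weights (a doubling for "increment", a halving for "decrement", a sign flip for "revert") are hypothetical and are not taken from the PragmaticScorer source.

```scala
// Hypothetical sketch: names and weights are illustrative assumptions,
// not the actual PragmaticScorer implementation.
object ScorerSketch {
  // Tags a word can carry; "increment", "decrement" and "revert" act as amplifiers.
  val dictionary: Map[String, String] = Map(
    "nice"   -> "positive",
    "bad"    -> "negative",
    "very"   -> "increment",
    "barely" -> "decrement",
    "not"    -> "revert"
  )

  def score(tokens: Seq[String]): Double = {
    var total = 0.0
    var multiplier = 1.0 // accumulated from amplifiers seen since the last sentiment word
    tokens.map(_.toLowerCase).foreach { token =>
      dictionary.get(token) match {
        case Some("positive")  => total += 1.0 * multiplier; multiplier = 1.0
        case Some("negative")  => total += -1.0 * multiplier; multiplier = 1.0
        case Some("increment") => multiplier *= 2.0
        case Some("decrement") => multiplier *= 0.5
        case Some("revert")    => multiplier *= -1.0
        case _                 => () // untagged words do not change the score
      }
    }
    total
  }
}
```

    In this sketch an amplifier only affects the next sentiment word that follows it, which keeps the context rule easy to follow; a real scorer may look at context differently.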

  2. class SentimentDetector extends AnnotatorApproach[SentimentDetectorModel]

    Trains a rule-based sentiment detector, which calculates a score based on predefined keywords.

    A dictionary of predefined sentiment keywords must be provided with setDictionary. Each line of the dictionary holds a word and its class (either positive or negative), separated by a delimiter. The dictionary can be set either as a delimited text file or directly as an ExternalResource.
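
    To make the dictionary format concrete, here is a minimal sketch that parses delimited lines into a word-to-class map. The object DictionarySketch and its parse method are hypothetical helpers written for illustration; they are not part of the library, and the library's own loading logic may differ.

```scala
// Hypothetical helper: illustrates the "word<delimiter>class" line format only.
object DictionarySketch {
  // Parses lines such as "cool,positive" into a word -> class map.
  // Blank lines and lines that do not split into exactly two fields are skipped.
  def parse(lines: Seq[String], delimiter: String): Map[String, String] =
    lines
      .map(_.trim)
      .filter(_.nonEmpty)
      // String.split takes a regex, so the delimiter is quoted to be matched literally.
      .map(_.split(java.util.regex.Pattern.quote(delimiter)).map(_.trim))
      .collect { case Array(word, label) => word -> label }
      .toMap
}
```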

    By default, the result is labeled "positive" if the sentiment score is >= 0 and "negative" otherwise. To retrieve the raw sentiment scores instead, set enableScore to true.
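
    The labeling rule described above can be sketched as a small pure function. The object LabelSketch is a hypothetical illustration of the documented behavior, not the library's code; only the threshold (>= 0) and the enableScore switch come from the description.

```scala
// Hypothetical sketch of the documented default labeling rule.
object LabelSketch {
  // Scores >= 0 map to "positive", all others to "negative".
  // With enableScore set, the raw score is returned as a string instead.
  def label(score: Double, enableScore: Boolean = false): String =
    if (enableScore) score.toString
    else if (score >= 0) "positive"
    else "negative"
}
```

    Note that under this rule a score of exactly 0 is labeled "positive", so a sentence with no matching keywords would not come out as "negative".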

    For extended examples of usage, see the Examples and the SentimentTestSpec.

    Example

    In this example, the dictionary default-sentiment-dict.txt has the form

    ...
    cool,positive
    superb,positive
    bad,negative
    uninspired,negative
    ...

    where each sentiment keyword is delimited by ",".

    // Assumes an active SparkSession is available as `spark`.
    import spark.implicits._
    import com.johnsnowlabs.nlp.DocumentAssembler
    import com.johnsnowlabs.nlp.annotator.Tokenizer
    import com.johnsnowlabs.nlp.annotators.Lemmatizer
    import com.johnsnowlabs.nlp.annotators.sda.pragmatic.SentimentDetector
    import com.johnsnowlabs.nlp.util.io.ReadAs
    import org.apache.spark.ml.Pipeline
    
    val documentAssembler = new DocumentAssembler()
      .setInputCol("text")
      .setOutputCol("document")
    
    val tokenizer = new Tokenizer()
      .setInputCols("document")
      .setOutputCol("token")
    
    val lemmatizer = new Lemmatizer()
      .setInputCols("token")
      .setOutputCol("lemma")
      .setDictionary("src/test/resources/lemma-corpus-small/lemmas_small.txt", "->", "\t")
    
    val sentimentDetector = new SentimentDetector()
      .setInputCols("lemma", "document")
      .setOutputCol("sentimentScore")
      .setDictionary("src/test/resources/sentiment-corpus/default-sentiment-dict.txt", ",", ReadAs.TEXT)
    
    val pipeline = new Pipeline().setStages(Array(
      documentAssembler,
      tokenizer,
      lemmatizer,
      sentimentDetector,
    ))
    
    val data = Seq(
      "The staff of the restaurant is nice",
      "I recommend others to avoid because it is too expensive"
    ).toDF("text")
    val result = pipeline.fit(data).transform(data)
    
    result.selectExpr("sentimentScore.result").show(false)
    +----------+  //  +------+ for enableScore set to true
    |result    |  //  |result|
    +----------+  //  +------+
    |[positive]|  //  |[1.0] |
    |[negative]|  //  |[-2.0]|
    +----------+  //  +------+

    See also

    ViveknSentimentApproach for an alternative approach to sentiment extraction

  3. class SentimentDetectorModel extends AnnotatorModel[SentimentDetectorModel] with HasSimpleAnnotate[SentimentDetectorModel]

    Rule-based sentiment detector, which calculates a score based on predefined keywords.

    This is the instantiated model of the SentimentDetector. For training your own model, please see the documentation of that class.

    A dictionary of predefined sentiment keywords must be provided with setDictionary. Each line of the dictionary holds a word and its class (either positive or negative), separated by a delimiter. The dictionary can be set either as a delimited text file or directly as an ExternalResource.

    By default, the result is labeled "positive" if the sentiment score is >= 0 and "negative" otherwise. To retrieve the raw sentiment scores instead, set enableScore to true.

    For extended examples of usage, see the Examples and the SentimentTestSpec.

    See also

    ViveknSentimentApproach for an alternative approach to sentiment extraction

Value Members

  1. object SentimentDetector extends DefaultParamsReadable[SentimentDetector] with Serializable

    This is the companion object of SentimentDetector. Please refer to that class for the documentation.

  2. object SentimentDetectorModel extends ParamsAndFeaturesReadable[SentimentDetectorModel] with Serializable
