package pragmatic
Type Members
-
class
PragmaticScorer extends Serializable
Scorer is a rule-based implementation inspired by http://fjavieralba.com/basic-sentiment-analysis-with-python.html. Its strategy is to tag the words of a sentence using a dictionary, and then to inspect the context of each tagged word to identify amplifiers.
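The tag-then-inspect-context strategy can be illustrated with a small standalone sketch. This is not the PragmaticScorer source: the object name RuleBasedScoreSketch, the tag names ("increment", "revert"), the weights, and the rule of looking only at the immediately preceding token are all assumptions made for illustration.

```scala
// Hypothetical sketch of dictionary tagging plus context amplifiers.
// Tag names and weights are assumed, not taken from PragmaticScorer.
object RuleBasedScoreSketch {
  val dictionary: Map[String, String] = Map(
    "nice" -> "positive",
    "bad"  -> "negative",
    "very" -> "increment", // assumed amplifier tag
    "not"  -> "revert"     // assumed negation tag
  )

  def score(tokens: Seq[String]): Double = {
    // Step 1: tag every token with its dictionary class, if it has one.
    val tagged = tokens.map(t => dictionary.get(t.toLowerCase)).toIndexedSeq

    // Step 2: score each sentiment word, adjusted by the tag of the previous token.
    tagged.indices.map { i =>
      val base = tagged(i) match {
        case Some("positive") => 1.0
        case Some("negative") => -1.0
        case _                => 0.0
      }
      val previous = if (i > 0) tagged(i - 1) else None
      previous match {
        case Some("increment") => base * 2.0  // amplify the sentiment word
        case Some("revert")    => base * -1.0 // flip its polarity
        case _                 => base
      }
    }.sum
  }
}
```

Under these assumed rules, the tokens "very nice" score higher than "nice" alone, and "not bad" flips a negative keyword into a positive contribution.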
-
class
SentimentDetector extends AnnotatorApproach[SentimentDetectorModel]
Trains a rule based sentiment detector, which calculates a score based on predefined keywords.
A dictionary of predefined sentiment keywords must be provided with setDictionary, where each line holds a word and its class (either positive or negative), separated by a delimiter. The dictionary can be set either in the form of a delimited text file or directly as an ExternalResource.
By default, the sentiment score is mapped to the label "positive" if the score is >= 0, else "negative". To retrieve the raw sentiment scores, enableScore needs to be set to true.
For extended examples of usage, see the Examples and the SentimentTestSpec.
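The dictionary format and the default labelling rule described above can be sketched in plain Scala. This is a minimal illustration, not the library's code: SentimentLabelSketch, parseDictionary, label, and render are hypothetical names, and only the ">= 0" threshold and the enableScore switch come from the description.

```scala
// Hypothetical helpers; none of these names exist in Spark NLP.
object SentimentLabelSketch {
  // Parse lines of the form "word<delimiter>class" into a lookup map.
  // A line without the delimiter fails with a MatchError in this sketch.
  def parseDictionary(lines: Seq[String], delimiter: String): Map[String, String] =
    lines.map(_.trim).filter(_.nonEmpty).map { line =>
      val Array(word, sentiment) =
        line.split(java.util.regex.Pattern.quote(delimiter), 2)
      word.trim -> sentiment.trim
    }.toMap

  // Default labelling rule: scores >= 0 are "positive", everything else "negative".
  def label(score: Double): String =
    if (score >= 0) "positive" else "negative"

  // With enableScore set, the raw score is reported instead of the label.
  def render(score: Double, enableScore: Boolean): String =
    if (enableScore) score.toString else label(score)
}
```

One consequence of the ">= 0" rule is that a sentence containing no dictionary keywords, and therefore scoring 0, is labelled "positive".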
Example
In this example, the dictionary default-sentiment-dict.txt has the form of

...
cool,positive
superb,positive
bad,negative
uninspired,negative
...

where each sentiment keyword is delimited by ",".

import spark.implicits._
import com.johnsnowlabs.nlp.DocumentAssembler
import com.johnsnowlabs.nlp.annotator.Tokenizer
import com.johnsnowlabs.nlp.annotators.Lemmatizer
import com.johnsnowlabs.nlp.annotators.sda.pragmatic.SentimentDetector
import com.johnsnowlabs.nlp.util.io.ReadAs
import org.apache.spark.ml.Pipeline

val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val tokenizer = new Tokenizer()
  .setInputCols("document")
  .setOutputCol("token")

val lemmatizer = new Lemmatizer()
  .setInputCols("token")
  .setOutputCol("lemma")
  .setDictionary("src/test/resources/lemma-corpus-small/lemmas_small.txt", "->", "\t")

val sentimentDetector = new SentimentDetector()
  .setInputCols("lemma", "document")
  .setOutputCol("sentimentScore")
  .setDictionary("src/test/resources/sentiment-corpus/default-sentiment-dict.txt", ",", ReadAs.TEXT)

val pipeline = new Pipeline().setStages(Array(
  documentAssembler,
  tokenizer,
  lemmatizer,
  sentimentDetector
))

val data = Seq(
  "The staff of the restaurant is nice",
  "I recommend others to avoid because it is too expensive"
).toDF("text")
val result = pipeline.fit(data).transform(data)

result.selectExpr("sentimentScore.result").show(false)
+----------+  //  +------+ for enableScore set to true
|result    |  //  |result|
+----------+  //  +------+
|[positive]|  //  |[1.0] |
|[negative]|  //  |[-2.0]|
+----------+  //  +------+
- See also
ViveknSentimentApproach for an alternative approach to sentiment extraction
-
class
SentimentDetectorModel extends AnnotatorModel[SentimentDetectorModel] with HasSimpleAnnotate[SentimentDetectorModel]
Rule based sentiment detector, which calculates a score based on predefined keywords.
This is the instantiated model of the SentimentDetector. For training your own model, please see the documentation of that class.
A dictionary of predefined sentiment keywords must be provided with setDictionary, where each line holds a word and its class (either positive or negative), separated by a delimiter. The dictionary can be set either in the form of a delimited text file or directly as an ExternalResource.
By default, the sentiment score is mapped to the label "positive" if the score is >= 0, else "negative". To retrieve the raw sentiment scores, enableScore needs to be set to true.
For extended examples of usage, see the Examples and the SentimentTestSpec.
- See also
ViveknSentimentApproach for an alternative approach to sentiment extraction
Value Members
-
object
SentimentDetector extends DefaultParamsReadable[SentimentDetector] with Serializable
This is the companion object of SentimentDetector. Please refer to that class for the documentation.
-
object
SentimentDetectorModel extends ParamsAndFeaturesReadable[SentimentDetectorModel] with Serializable