Sentiment Analysis of German news

Description

This model was imported from Hugging Face and has been fine-tuned on German-language news texts about migration. It uses BERT embeddings with BertForSequenceClassification to classify the sentiment of a text.

Predicted Entities

positive, negative, neutral

How to use

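# Python
from pyspark.ml import Pipeline
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import Tokenizer, BertForSequenceClassification

# Assumes an active Spark NLP session named `spark`, e.g. spark = sparknlp.start()

# Turn the raw text column into Spark NLP documents
document_assembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

# Split each document into tokens
tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

# Load the pretrained German news sentiment classifier
sequenceClassifier = BertForSequenceClassification \
    .pretrained('bert_sequence_classifier_news_sentiment', 'de') \
    .setInputCols(['token', 'document']) \
    .setOutputCol('class')

pipeline = Pipeline(stages=[document_assembler, tokenizer, sequenceClassifier])

example = spark.createDataFrame([['Die Zahl der Flüchtlinge in Deutschland steigt von Tag zu Tag.']]).toDF("text")
result = pipeline.fit(example).transform(example)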

// Scala
import com.johnsnowlabs.nlp.base._
import com.johnsnowlabs.nlp.annotator._
import org.apache.spark.ml.Pipeline
import spark.implicits._ // needed for .toDS; assumes an active SparkSession named `spark`

val document_assembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val tokenizer = new Tokenizer()
  .setInputCols("document")
  .setOutputCol("token")

val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_news_sentiment", "de")
  .setInputCols("document", "token")
  .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier))

val example = Seq("Die Zahl der Flüchtlinge in Deutschland steigt von Tag zu Tag.").toDS.toDF("text")

val result = pipeline.fit(example).transform(example)

# NLU (Python)
import nlu
nlu.load("de.classify.news_sentiment.bert").predict("""Die Zahl der Flüchtlinge in Deutschland steigt von Tag zu Tag.""")

Results

['neutral']
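
The result above is a list of labels for one document. As a minimal sketch of how such output might be post-processed once it has been collected to the driver, the helpers below pick the label for each document and tally labels across documents. The names first_label, label_counts, and rows are hypothetical and not part of Spark NLP; the sample rows are made-up values in the same shape as the result shown above, not model output.

```python
from collections import Counter

# The three labels listed under "Predicted Entities".
LABELS = ("positive", "negative", "neutral")


def first_label(class_results):
    """Return the label predicted for one document, or None if there is none.

    `class_results` is a list of strings in the shape shown under Results,
    e.g. ['neutral'].
    """
    if not class_results:
        return None
    label = class_results[0]
    if label not in LABELS:
        raise ValueError(f"unexpected label: {label!r}")
    return label


def label_counts(rows):
    """Count labels over many documents; documents without a label are skipped."""
    labels = (first_label(r) for r in rows)
    return Counter(label for label in labels if label is not None)


# Hypothetical collected output for three documents (illustrative values only).
# With Spark, such a list could be built with something like:
#   rows = [r["result"] for r in result.select("class.result").collect()]
rows = [["neutral"], ["negative"], ["neutral"]]
counts = label_counts(rows)
print(counts["neutral"], counts["negative"], counts["positive"])
```

Validating against the known label set is a design choice: it surfaces a mismatched model or column early instead of silently counting an unexpected string.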

Model Information

Model Name: bert_sequence_classifier_news_sentiment
Compatibility: Spark NLP 3.3.4+
License: Open Source
Edition: Official
Input Labels: [document, token]
Output Labels: [class]
Language: de
Size: 408.7 MB
Case sensitive: true
Max sentence length: 512
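
The maximum sentence length of 512 is presumably counted in BERT subword tokens rather than words, so long news articles may exceed it. One possible workaround, sketched below under that assumption, is to split a long text into smaller pieces before classification and classify each piece separately. The function split_into_chunks and the budget of 200 words are hypothetical choices for illustration; a whitespace word count only approximates the token count, since German compounds often break into several subword tokens.

```python
def split_into_chunks(text, max_words=200):
    """Split text into consecutive chunks of at most `max_words` whitespace-separated words.

    The default budget of 200 words is an assumed safety margin below the
    512-token limit, not a value taken from the model card.
    """
    if max_words < 1:
        raise ValueError("max_words must be at least 1")
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]


# Small illustration with a tiny budget so the split is visible.
chunks = split_into_chunks("Die Zahl der Flüchtlinge in Deutschland steigt", max_words=3)
for chunk in chunks:
    print(chunk)
```

Each chunk could then be placed in its own row of the "text" column. How the per-chunk labels are combined into one label for the whole article is left open here.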

Data Source

https://wortschatz.uni-leipzig.de/en/download/German