Explain Document DL Pipeline for English

Description

explain_document_dl is a pretrained pipeline that runs the common text-processing steps on a string or a DataFrame column and recognizes named entities. Its stages cover sentence detection, tokenization, spell checking, lemmatization, stemming, part-of-speech tagging, word embeddings, and named entity recognition.

How to use

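import sparknlp
from sparknlp.pretrained import PretrainedPipeline
spark = sparknlp.start()  # start (or reuse) a Spark session with Spark NLP loaded
pipeline = PretrainedPipeline('explain_document_dl', lang='en')
# fullAnnotate returns one result per input text; take the first
annotations = pipeline.fullAnnotate("The Mona Lisa is an oil painting from the 16th century.")[0]
annotations.keys()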

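import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline
val pipeline = new PretrainedPipeline("explain_document_dl", lang = "en")
// fullAnnotate on a single String returns a Map keyed by output column
val result = pipeline.fullAnnotate("The Mona Lisa is an oil painting from the 16th century.")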

import nlu
text = ["The Mona Lisa is an oil painting from the 16th century."]
result_df = nlu.load('en.explain.dl').predict(text)
result_df
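
In the Python snippet above, `fullAnnotate(...)[0]` gives a dict that maps each output column to a list of annotation objects, and the text of each annotation sits in its `.result` attribute. The sketch below flattens such a dict into plain strings. The helper name `results_by_column` is hypothetical, and the input is hand-built stand-in data (not real pipeline output) so the snippet runs without downloading the model.

```python
from types import SimpleNamespace


def results_by_column(annotations):
    """Map each output column to the list of its annotations' `.result` values."""
    return {col: [a.result for a in anns] for col, anns in annotations.items()}


# Hypothetical stand-in for one fullAnnotate(...)[0] result.
# Real annotation objects carry more fields; only `.result` is used here.
fake_annotations = {
    "token": [SimpleNamespace(result=t) for t in ["The", "Mona", "Lisa"]],
    "ner": [SimpleNamespace(result=t) for t in ["O", "B-PER", "I-PER"]],
}

flat = results_by_column(fake_annotations)
print(flat["token"])
print(flat["ner"])
```

With a real pipeline, the same helper could be called as `results_by_column(annotations)` on the dict returned by the Python example.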

Results

+--------------------------------------------------+--------------------------------------------------+--------------------------------------------------+--------------------------------------------------+--------------------------------------------------+--------------------------------------------------+--------------------------------------------------+--------------------------------------------------+--------------------------------------------------+--------------------------------------------+-----------+
|                                              text|                                          document|                                          sentence|                                             token|                                           checked|                                             lemma|                                              stem|                                               pos|                                        embeddings|                                         ner|   entities|
+--------------------------------------------------+--------------------------------------------------+--------------------------------------------------+--------------------------------------------------+--------------------------------------------------+--------------------------------------------------+--------------------------------------------------+--------------------------------------------------+--------------------------------------------------+--------------------------------------------+-----------+
|The Mona Lisa is an oil painting from the 16th ...|[The Mona Lisa is an oil painting from the 16th...|[The Mona Lisa is an oil painting from the 16th...|[The, Mona, Lisa, is, an, oil, painting, from, ...|[The, Mona, Lisa, is, an, oil, painting, from, ...|[The, Mona, Lisa, be, an, oil, painting, from, ...|[the, mona, lisa, i, an, oil, paint, from, the,...|[DT, NNP, NNP, VBZ, DT, NN, NN, IN, DT, JJ, NN, .]|[[-0.038194, -0.24487, 0.72812, -0.39961, 0.083...|[O, B-PER, I-PER, O, O, O, O, O, O, O, O, O]|[Mona Lisa]|
+--------------------------------------------------+--------------------------------------------------+--------------------------------------------------+--------------------------------------------------+--------------------------------------------------+--------------------------------------------------+--------------------------------------------------+--------------------------------------------------+--------------------------------------------------+--------------------------------------------+-----------+
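
The `ner` column holds one IOB tag per token, and the `entities` column holds the merged chunks. The sketch below shows one simple way such tags can be grouped into chunks. It is an illustration only, not the pipeline's own conversion logic: the function name `iob_to_chunks` is hypothetical, tokens are joined with a single space, and the token list is an assumption reconstructed from the example sentence because the table truncates it.

```python
def iob_to_chunks(tokens, tags):
    """Group B-/I- tagged tokens into (label, text) chunks."""
    chunks, current, label = [], [], None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A B- tag always opens a new chunk, closing any open one.
            if current:
                chunks.append((label, " ".join(current)))
            current, label = [token], tag[2:]
        elif tag.startswith("I-") and current and tag[2:] == label:
            # An I- tag extends the open chunk only if the label matches.
            current.append(token)
        else:
            # "O" or a stray I- tag closes the open chunk and is skipped.
            if current:
                chunks.append((label, " ".join(current)))
            current, label = [], None
    if current:
        chunks.append((label, " ".join(current)))
    return chunks


# Assumed tokenization of the example sentence (12 tokens, matching the 12 tags).
tokens = ["The", "Mona", "Lisa", "is", "an", "oil", "painting",
          "from", "the", "16th", "century", "."]
tags = ["O", "B-PER", "I-PER"] + ["O"] * 9

print(iob_to_chunks(tokens, tags))  # [('PER', 'Mona Lisa')]
```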


{:.model-param}

Model Information

Model Name:	explain_document_dl
Type:	pipeline
Compatibility:	Spark NLP 4.4.2+
License:	Open Source
Edition:	Official
Language:	en
Size:	176.2 MB

Included Models

DocumentAssembler
SentenceDetector
TokenizerModel
NorvigSweetingModel
LemmatizerModel
Stemmer
PerceptronModel
WordEmbeddingsModel
NerDLModel
NerConverter
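
Read in order, the ten stages above appear to line up one-to-one with the ten output columns after `text` in the Results table. The pairing below is an inference from that ordering, not something this page states, and is meant only as a quick reference.

```python
# Stage names as listed under "Included Models".
stages = [
    "DocumentAssembler", "SentenceDetector", "TokenizerModel",
    "NorvigSweetingModel", "LemmatizerModel", "Stemmer",
    "PerceptronModel", "WordEmbeddingsModel", "NerDLModel", "NerConverter",
]

# Output columns as they appear in the Results table, excluding the input "text".
columns = [
    "document", "sentence", "token", "checked", "lemma",
    "stem", "pos", "embeddings", "ner", "entities",
]

# Assumed pairing: the i-th stage produces the i-th column.
stage_to_column = dict(zip(stages, columns))

for stage, column in stage_to_column.items():
    print(f"{stage:<20} -> {column}")
```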
