Description
explain_document_dl is a pretrained pipeline that processes text through a fixed sequence of basic steps: sentence segmentation, tokenization, spell checking, stemming, lemmatization, and part-of-speech tagging.
How to use
from sparknlp.pretrained import PretrainedPipeline
pipeline = PretrainedPipeline('explain_document_dl', lang = 'en')
annotations = pipeline.fullAnnotate("""French author who helped pioner the science-fiction genre. Verne wrate about space, air, and underwater travel before navigable aircrast and practical submarines were invented, and before any means of space travel had been devised.""")[0]
annotations.keys()
import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline
val pipeline = new PretrainedPipeline("explain_document_dl", lang = "en")
val result = pipeline.fullAnnotate("French author who helped pioner the science-fiction genre. Verne wrate about space, air, and underwater travel before navigable aircrast and practical submarines were invented, and before any means of space travel had been devised.")
import nlu
text = ["""John Snow built a detailed map of all the households where people died, and came to the conclusion that the fault was one public water pump that all the victims had used."""]
explain_df = nlu.load('en.explain.dl').predict(text)
explain_df
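The Python call to fullAnnotate above returns a dictionary keyed by output column, where each value is a list of annotation objects. A minimal sketch of flattening that dictionary into plain strings follows. The `SimpleNamespace` objects are hypothetical stand-ins that mimic only the `result` attribute of a Spark NLP annotation, so the snippet runs without a Spark session, and `results_by_column` is an illustrative helper, not part of the library.

```python
from types import SimpleNamespace


def results_by_column(annotations):
    """Map each output column to the list of its annotations' result strings."""
    return {
        column: [item.result for item in items]
        for column, items in annotations.items()
    }


# Stand-in for the dictionary returned by pipeline.fullAnnotate(...)[0].
# These values are placeholders, not real pipeline output.
stand_in_annotations = {
    "token": [SimpleNamespace(result="French"), SimpleNamespace(result="author")],
    "stems": [SimpleNamespace(result="french"), SimpleNamespace(result="author")],
}

flat = results_by_column(stand_in_annotations)
for column, values in flat.items():
    print(column, values)
```

With a real session, passing the `annotations` dictionary from the Python example to the same helper should work the same way, assuming each annotation exposes `result`.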
Results
+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+
| text| document| sentence| token| spell| lemmas| stems| pos|
+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+
|French author who...|[[document, 0, 23...|[[document, 0, 57...|[[token, 0, 5, Fr...|[[token, 0, 5, Fr...|[[token, 0, 5, Fr...|[[token, 0, 5, fr...|[[pos, 0, 5, JJ, ...|
+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+
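Each cell in the table above holds a list of annotation rows whose leading fields are the annotator type, a begin offset, an end offset, and the result string. The visible values suggest that the end offset is inclusive: the first token spans 0 to 5 for the six-letter word "French", and the first sentence spans 0 to 57 for a 58-character sentence. A small sketch of recovering the text from such offsets, using a hypothetical helper named `span`:

```python
# First sentence of the sample text used in the examples above.
sentence = "French author who helped pioner the science-fiction genre."


def span(text, begin, end):
    """Return the substring for an annotation whose end offset is inclusive."""
    return text[begin:end + 1]


first_token = span(sentence, 0, 5)       # offsets of the first token row
first_sentence = span(sentence, 0, 57)   # offsets of the first sentence row
print(first_token)
print(first_sentence == sentence)
```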
Model Information
| Model Name: | explain_document_dl |
| Type: | pipeline |
| Compatibility: | Spark NLP 2.5.5+ |
| License: | Open Source |
| Edition: | Community |
| Language: | [en] |
Included Models
The explain_document_dl pipeline has one Transformer and six annotators:
- DocumentAssembler - A Transformer that creates a column that contains documents.
- Sentence Segmenter - An annotator that produces the sentences of the document.
- Tokenizer - An annotator that produces the tokens of the sentences.
- SpellChecker - An annotator that produces the spelling-corrected tokens.
- Stemmer - An annotator that produces the stems of the tokens.
- Lemmatizer - An annotator that produces the lemmas of the tokens.
- POS Tagger - An annotator that produces the parts of speech of the associated tokens.
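The list above implies a dependency order: each stage reads a column that an earlier stage produced. The sketch below models that flow in plain Python and checks that the order is consistent. It does not call Spark NLP. The output column names are taken from the Results table, while the input wiring is an assumption inferred from the stage descriptions, and `check_order` is a hypothetical helper.

```python
# (stage name, input columns, output column)
# Inputs are assumptions inferred from the stage descriptions above.
STAGES = [
    ("DocumentAssembler", ["text"], "document"),
    ("Sentence Segmenter", ["document"], "sentence"),
    ("Tokenizer", ["sentence"], "token"),
    ("SpellChecker", ["token"], "spell"),
    ("Stemmer", ["token"], "stems"),
    ("Lemmatizer", ["token"], "lemmas"),
    ("POS Tagger", ["token"], "pos"),
]


def check_order(stages, initial=("text",)):
    """Return the output columns in order; raise if a stage reads a missing column."""
    available = set(initial)
    produced = []
    for name, inputs, output in stages:
        missing = [column for column in inputs if column not in available]
        if missing:
            raise ValueError(f"{name} needs {missing} before it can run")
        available.add(output)
        produced.append(output)
    return produced


columns = check_order(STAGES)
print(columns)
```

Moving the Tokenizer ahead of the Sentence Segmenter in `STAGES` makes `check_order` raise, which is the kind of ordering constraint a pretrained pipeline resolves in advance.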