Description
The explain_document_lg is a pre-trained pipeline that processes text with a simple sequence of basic steps (sentence detection, tokenization, and word embeddings) and recognizes named entities. It covers most of the common text processing tasks you would run on a DataFrame.
How to use
from sparknlp.pretrained import PretrainedPipeline
pipeline = PretrainedPipeline('explain_document_lg', lang = 'he')
annotations = pipeline.fullAnnotate("היי, מעבדות ג'ון סנו!")[0]
annotations.keys()
import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline

val pipeline = new PretrainedPipeline("explain_document_lg", lang = "he")
val result = pipeline.fullAnnotate("היי, מעבדות ג'ון סנו!")(0)
import nlu
nlu.load("he.explain_document").predict("""היי, מעבדות ג'ון סנו!""")
Results
+----------------------+------------------------+----------------------+---------------------------+--------------------+---------+
| text| document| sentence| token| ner|ner_chunk|
+----------------------+------------------------+----------------------+---------------------------+--------------------+---------+
| היי ג'ון מעבדות שלג! |[ היי ג'ון מעבדות שלג! ]|[היי ג'ון מעבדות שלג!]|[היי, ג'ון, מעבדות, שלג, !]|[O, B-PERS, O, O, O]| [ג'ון]|
+----------------------+------------------------+----------------------+---------------------------+--------------------+---------+
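To pull just the recognized entity chunks out of the fullAnnotate output, here is a short sketch against the Python example above; it assumes the output key matches the ner_chunk column shown in the table.

# `annotations` is the dict returned by fullAnnotate(...)[0] above; each value
# is a list of Annotation objects whose `result` field holds the covered text.
entities = [chunk.result for chunk in annotations["ner_chunk"]]
print(entities)  # expected to contain the recognized person chunk, e.g. ["ג'ון"]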
Model Information
| Model Name:    | explain_document_lg |
| Type:          | pipeline            |
| Compatibility: | Spark NLP 3.0.2+    |
| License:       | Open Source         |
| Edition:       | Official            |
| Language:      | he                  |
Included Models
- DocumentAssembler
- SentenceDetector
- TokenizerModel
- WordEmbeddingsModel
- NerDLModel
- NerConverter
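If you need to inspect or reuse the individual stages listed above, the wrapped Spark ML PipelineModel can be reached from the pretrained pipeline. A minimal sketch, assuming the model attribute exposed by the Python PretrainedPipeline wrapper:

from sparknlp.pretrained import PretrainedPipeline

pipeline = PretrainedPipeline('explain_document_lg', lang='he')

# The underlying PipelineModel holds the stages listed above
# (DocumentAssembler, SentenceDetector, TokenizerModel, WordEmbeddingsModel,
# NerDLModel, NerConverter).
for stage in pipeline.model.stages:
    print(type(stage).__name__)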