Description
The explain_document_lg model is a pre-trained pipeline that processes text through the basic annotation steps and recognizes named entities. It performs most of the common text processing tasks on your DataFrame.
How to use
Python:
from sparknlp.pretrained import PretrainedPipeline
pipeline = PretrainedPipeline('explain_document_lg', lang = 'ko')
annotations = pipeline.fullAnnotate("안녕하세요, 환영합니다!")[0]
annotations.keys()
Scala:
val pipeline = new PretrainedPipeline("explain_document_lg", lang = "ko")
val result = pipeline.fullAnnotate("안녕하세요, 환영합니다!")(0)
NLU:
import nlu
nlu.load("ko.explain_document").predict("""안녕하세요, 환영합니다!""")
Results
+------------------------+--------------------------+--------------------------+--------------------------------+----------------------------+---------------------+
|text |document |sentence |token |ner |ner_chunk |
+------------------------+--------------------------+--------------------------+--------------------------------+----------------------------+---------------------+
|안녕, 존 스노우!|[안녕, 존 스노우!]|[안녕, 존 스노우!]|[안녕, ,, 존, 스노우, !] |[B-DATE, O, O, O, O]| [안녕] |
+------------------------+--------------------------+--------------------------+--------------------------------+----------------------------+---------------------+
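The ner column holds per-token IOB tags, which the pipeline's NerConverter stage groups into the entity spans shown in ner_chunk. As a minimal pure-Python sketch of that grouping (a hypothetical helper, not the actual NerConverter implementation), using the tokens and tags from the example row above:

```python
def iob_chunks(tokens, tags):
    """Group IOB-tagged tokens into (chunk_text, label) pairs.

    A B- tag starts a new chunk, a matching I- tag extends the
    current chunk, and O closes any open chunk.
    """
    chunks = []
    current = None  # (label, [tokens]) of the chunk being built
    for tok, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            if current:
                chunks.append(current)
            current = (tag[2:], [tok])
        elif tag.startswith("I-") and current and tag[2:] == current[0]:
            current[1].append(tok)
        else:
            if current:
                chunks.append(current)
            current = None
    if current:
        chunks.append(current)
    return [(" ".join(toks), label) for label, toks in chunks]

# Tokens and tags from the results row above:
tokens = ["안녕", ",", "존", "스노우", "!"]
tags = ["B-DATE", "O", "O", "O", "O"]
print(iob_chunks(tokens, tags))  # [('안녕', 'DATE')]
```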
Model Information
Model Name: explain_document_lg
Type: pipeline
Compatibility: Spark NLP 3.0.2+
License: Open Source
Edition: Official
Language: ko
Included Models
- DocumentAssembler
- SentenceDetector
- WordSegmenterModel
- WordEmbeddingsModel
- NerDLModel
- NerConverter
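An equivalent pipeline could be assembled manually from these stages. The sketch below shows the wiring of input/output columns between the six stages; the specific pretrained model names passed to `.pretrained()` are assumptions for illustration and may not match the models actually bundled in explain_document_lg.

```python
from pyspark.ml import Pipeline
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import (
    SentenceDetector, WordSegmenterModel, WordEmbeddingsModel,
    NerDLModel, NerConverter,
)

document_assembler = DocumentAssembler() \
    .setInputCol("text") \
    .setOutputCol("document")

sentence_detector = SentenceDetector() \
    .setInputCols(["document"]) \
    .setOutputCol("sentence")

# Korean has no whitespace-delimited words, so a word segmenter
# replaces the usual tokenizer. Model name is an assumption.
word_segmenter = WordSegmenterModel.pretrained("wordseg_kaist_ud", "ko") \
    .setInputCols(["sentence"]) \
    .setOutputCol("token")

# Embedding and NER model names are likewise assumptions.
embeddings = WordEmbeddingsModel.pretrained("glove_840B_300", "xx") \
    .setInputCols(["sentence", "token"]) \
    .setOutputCol("embeddings")

ner = NerDLModel.pretrained("ner_kmou_glove_840B_300d", "ko") \
    .setInputCols(["sentence", "token", "embeddings"]) \
    .setOutputCol("ner")

ner_converter = NerConverter() \
    .setInputCols(["sentence", "token", "ner"]) \
    .setOutputCol("ner_chunk")

pipeline = Pipeline(stages=[
    document_assembler, sentence_detector, word_segmenter,
    embeddings, ner, ner_converter,
])
```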