sparknlp.annotator.date2_chunk
Contains classes for Date2Chunk.
Module Contents
Classes
Date2Chunk | Converts DATE type Annotations to CHUNK type.
- class Date2Chunk
Converts `DATE` type Annotations to `CHUNK` type.
This can be useful if the annotators that follow DateMatcher and MultiDateMatcher require `CHUNK` types.
- Parameters:
- entityName
Entity name for the metadata, by default "DATE".
Examples
>>> from pyspark.ml import Pipeline
>>> import sparknlp
>>> from sparknlp.base import *
>>> from sparknlp.annotator import *
>>> documentAssembler = DocumentAssembler() \
...     .setInputCol("text") \
...     .setOutputCol("document")
>>> date = DateMatcher() \
...     .setInputCols(["document"]) \
...     .setOutputCol("date")
>>> date2Chunk = Date2Chunk() \
...     .setInputCols(["date"]) \
...     .setOutputCol("date_chunk")
>>> pipeline = Pipeline().setStages([
...     documentAssembler,
...     date,
...     date2Chunk
... ])
>>> data = spark.createDataFrame([["Omicron is a new variant of COVID-19, which the World Health Organization designated a variant of concern on Nov. 26, 2021/26/11."]]).toDF("text")
>>> result = pipeline.fit(data).transform(data)
>>> result.select("date_chunk").show(1, truncate=False)
+----------------------------------------------------+
|date_chunk                                          |
+----------------------------------------------------+
|[{chunk, 118, 121, 2021/01/01, {sentence -> 0}, []}]|
+----------------------------------------------------+
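As a rough mental model of the conversion described above, the following plain-Python sketch (no Spark required) retypes a DATE annotation as a CHUNK annotation. The `Annotation` dataclass, the `date_to_chunk` helper, and the `entity` metadata key are hypothetical stand-ins chosen for illustration. They are assumptions, not Spark NLP's actual implementation.

```python
from dataclasses import dataclass, field, replace
from typing import Dict


@dataclass(frozen=True)
class Annotation:
    """Hypothetical, simplified stand-in for a Spark NLP annotation."""
    annotator_type: str
    begin: int
    end: int
    result: str
    metadata: Dict[str, str] = field(default_factory=dict)


def date_to_chunk(annotation: Annotation, entity_name: str = "DATE") -> Annotation:
    """Return a copy of a date annotation retyped as a chunk.

    Assumption: the entity name is stored under an "entity" metadata key.
    """
    if annotation.annotator_type != "date":
        raise ValueError("expected a 'date' annotation")
    # Offsets and matched text are kept; only the type and metadata change.
    metadata = {**annotation.metadata, "entity": entity_name}
    return replace(annotation, annotator_type="chunk", metadata=metadata)


# Values mirror the single row shown in the example output above.
date = Annotation("date", 118, 121, "2021/01/01", {"sentence": "0"})
chunk = date_to_chunk(date)
print(chunk.annotator_type, chunk.begin, chunk.end, chunk.result)
```

The sketch leaves the input annotation untouched and returns a new one, which matches how a pipeline stage writes to its own output column rather than rewriting its input column.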