Match Chunks in Texts

Description

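The pipeline extracts noun-phrase chunks by matching the regex <DT>?<JJ>*<NN>+ against part-of-speech tags: an optional determiner (DT), any number of adjectives (JJ), and one or more nouns (NN).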
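As a rough illustration of what such a tag pattern does, the sketch below applies it to a hand-tagged fragment of the example sentence in plain Python. It assumes the intended pattern is <DT>?<JJ>*<NN>+ and that tags follow the Penn Treebank set. The tags, the `match_chunks` helper, and the string-encoding approach are illustrative assumptions, not Spark NLP's actual Chunker implementation.

```python
import re

# Hand-assigned Penn Treebank tags (an assumption for illustration);
# in the real pipeline they would come from the POS tagger stage.
tagged = [
    ("David", "NNP"), ("visited", "VBD"), ("the", "DT"),
    ("restaurant", "NN"), ("yesterday", "NN"), ("with", "IN"),
    ("his", "PRP$"), ("family", "NN"),
]

# Optional determiner, any number of adjectives, one or more nouns.
CHUNK_PATTERN = re.compile(r"(?:<DT>)?(?:<JJ>)*(?:<NN>)+")


def match_chunks(tagged_tokens):
    """Return the token spans whose tag sequence matches CHUNK_PATTERN."""
    # Encode the tags as one string such as "<NNP><VBD><DT>..." and
    # remember the character offset at which each token's tag starts.
    tag_string = ""
    offsets = []
    for _, tag in tagged_tokens:
        offsets.append(len(tag_string))
        tag_string += f"<{tag}>"

    chunks = []
    for m in CHUNK_PATTERN.finditer(tag_string):
        first = offsets.index(m.start())          # index of first matched token
        last = first + m.group().count("<")       # one "<" per matched tag
        chunks.append(" ".join(tok for tok, _ in tagged_tokens[first:last]))
    return chunks


print(match_chunks(tagged))  # ['the restaurant yesterday', 'family']
```

With these tags, 'David' (NNP) and 'his' (PRP$) fall outside the pattern, which is consistent with the first two entries under Results.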


How to use

from sparknlp.pretrained import PretrainedPipeline

pipeline_local = PretrainedPipeline('match_chunks')

result = pipeline_local.annotate("David visited the restaurant yesterday with his family. He also visited and the day before, but at that time he was alone. David again visited today with his colleagues. He and his friends really liked the food and hoped to visit again tomorrow.")

result['chunk']

import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline
import com.johnsnowlabs.nlp.SparkNLP

SparkNLP.version()

val testData = spark.createDataFrame(Seq( (1, "David visited the restaurant yesterday with his family. He also visited and the day before, but at that time he was alone. David again visited today with his colleagues. He and his friends really liked the food and hoped to visit again tomorrow."))).toDF("id", "text")

val pipeline = PretrainedPipeline("match_chunks", lang="en")

val annotation = pipeline.transform(testData)

annotation.show()

import nlu
nlu.load("en.match.chunks").predict("""David visited the restaurant yesterday with his family. He also visited and the day before, but at that time he was alone. David again visited today with his colleagues. He and his friends really liked the food and hoped to visit again tomorrow.""")

Results

['the restaurant yesterday',
'family',
'the day',
'that time',
'today',
'the food',
'tomorrow']

Model Information

Model Name: match_chunks
Type: pipeline
Compatibility: Spark NLP 3.3.4+
License: Open Source
Edition: Official
Language: en
Size: 4.1 MB

Included Models

  • DocumentAssembler
  • SentenceDetector
  • TokenizerModel
  • PerceptronModel
  • Chunker
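
The stages listed above could be wired together by hand roughly as follows. This is a sketch under assumptions, not the published pipeline's definition: the column names, the default `PerceptronModel.pretrained()` tagger, and the regex passed to `setRegexParsers` are guesses at an equivalent setup, and `build_chunk_pipeline` is a hypothetical helper name. The imports sit inside the function so that the snippet loads even where Spark NLP is not installed.

```python
def build_chunk_pipeline():
    """Assemble an unfitted Spark ML pipeline mirroring the listed stages.

    The configuration is assumed, not taken from the published pipeline.
    """
    # Imported lazily: these require pyspark and spark-nlp to be installed.
    from pyspark.ml import Pipeline
    from sparknlp.base import DocumentAssembler
    from sparknlp.annotator import (
        SentenceDetector, Tokenizer, PerceptronModel, Chunker,
    )

    document = DocumentAssembler().setInputCol("text").setOutputCol("document")
    sentence = SentenceDetector().setInputCols(["document"]).setOutputCol("sentence")
    # Fitting a Tokenizer produces the TokenizerModel named in the list.
    token = Tokenizer().setInputCols(["sentence"]).setOutputCol("token")
    # Downloads a default English POS tagger; which tagger the published
    # pipeline bundles is an assumption here.
    pos = (
        PerceptronModel.pretrained()
        .setInputCols(["sentence", "token"])
        .setOutputCol("pos")
    )
    # Assumed chunk pattern: optional determiner, adjectives, then nouns.
    chunker = (
        Chunker()
        .setInputCols(["sentence", "pos"])
        .setOutputCol("chunk")
        .setRegexParsers(["<DT>?<JJ>*<NN>+"])
    )
    return Pipeline(stages=[document, sentence, token, pos, chunker])
```

After starting a session with `sparknlp.start()`, fitting and transforming a DataFrame that has a `text` column should add a `chunk` column, although the exact chunks may differ from the pretrained pipeline if its tagger or settings differ.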