Context Spell Checker Pipeline for English

Description

This pretrained spell checker pipeline is built on top of the spellcheck_dl model. It is intended for PySpark 2.4.x users with Spark NLP 3.4.2 and above.

How to use

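# Python
import sparknlp
from sparknlp.pretrained import PretrainedPipeline

spark = sparknlp.start()  # start a Spark session with Spark NLP loaded
pipeline = PretrainedPipeline("spellcheck_dl_pipeline", lang="en")

text = ["During the summer we have the best ueather.", "I have a black ueather jacket, so nice."]

pipeline.annotate(text)  # returns one dict of annotations per input string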

// Scala
import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline

val pipeline = new PretrainedPipeline("spellcheck_dl_pipeline", lang = "en")

val example = Array("During the summer we have the best ueather.", "I have a black ueather jacket, so nice.")

pipeline.annotate(example)

Results

[{'checked': ['During', 'the', 'summer', 'we', 'have', 'the', 'best', 'weather', '.'],
  'document': ['During the summer we have the best ueather.'],
  'token': ['During', 'the', 'summer', 'we', 'have', 'the', 'best', 'ueather', '.']},

 {'checked': ['I', 'have', 'a', 'black', 'leather', 'jacket', ',', 'so', 'nice', '.'],
  'document': ['I have a black ueather jacket, so nice.'],
  'token': ['I', 'have', 'a', 'black', 'ueather', 'jacket', ',', 'so', 'nice', '.']}]
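
In the results above, the 'token' and 'checked' lists line up one-to-one, so the corrections can be recovered by comparing them position by position. The following is a minimal sketch in plain Python, assuming that alignment always holds; find_corrections is a hypothetical helper written for illustration, not part of Spark NLP, and the sample dictionary is copied from the first result.

```python
from typing import Dict, List, Tuple


def find_corrections(annotation: Dict[str, List[str]]) -> List[Tuple[int, str, str]]:
    """Return (position, original, corrected) for every token the checker changed.

    Hypothetical helper: assumes 'token' and 'checked' are aligned one-to-one,
    as they are in the results shown above.
    """
    tokens = annotation["token"]
    checked = annotation["checked"]
    # The comparison only makes sense when both lists have the same length.
    if len(tokens) != len(checked):
        raise ValueError("'token' and 'checked' differ in length")
    return [
        (i, original, corrected)
        for i, (original, corrected) in enumerate(zip(tokens, checked))
        if original != corrected
    ]


# Sample annotation copied from the first entry of the results.
sample = {
    "checked": ["During", "the", "summer", "we", "have", "the", "best", "weather", "."],
    "document": ["During the summer we have the best ueather."],
    "token": ["During", "the", "summer", "we", "have", "the", "best", "ueather", "."],
}

for position, original, corrected in find_corrections(sample):
    print(f"{position}: {original} -> {corrected}")
```

The same misspelling, "ueather", is corrected to "weather" in the first sentence and to "leather" in the second, which is the context sensitivity this pipeline is meant to provide.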

Model Information

Model Name:	spellcheck_dl_pipeline
Type:	pipeline
Compatibility:	Spark NLP 4.0.0+
License:	Open Source
Edition:	Community
Language:	en
Size:	99.8 MB

Included Models

DocumentAssembler
TokenizerModel
ContextSpellCheckerModel
