Context Spell Checker Pipeline for English

Description

This pretrained spell checker pipeline is built on top of the spellcheck_dl model. It is intended for PySpark 2.4.x users with Spark NLP 3.4.2 and above.

How to use


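# Python (assumes a Spark session with Spark NLP is already running, e.g. via sparknlp.start())
from sparknlp.pretrained import PretrainedPipeline

# Download and load the pretrained English spell checker pipeline
pipeline = PretrainedPipeline("spellcheck_dl_pipeline", lang="en")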

text = ["During the summer we have the best ueather.", "I have a black ueather jacket, so nice."]

pipeline.annotate(text)

// Scala
import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline

val pipeline = new PretrainedPipeline("spellcheck_dl_pipeline", lang = "en")

val example = Array("During the summer we have the best ueather.", "I have a black ueather jacket, so nice.")

pipeline.annotate(example)

Results


[{'checked': ['During', 'the', 'summer', 'we', 'have', 'the', 'best', 'weather', '.'],
  'document': ['During the summer we have the best ueather.'],
  'token': ['During', 'the', 'summer', 'we', 'have', 'the', 'best', 'ueather', '.']},

 {'checked': ['I', 'have', 'a', 'black', 'leather', 'jacket', ',', 'so', 'nice', '.'],
  'document': ['I have a black ueather jacket, so nice.'],
  'token': ['I', 'have', 'a', 'black', 'ueather', 'jacket', ',', 'so', 'nice', '.']}]
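
In the results above, the same misspelling ("ueather") is corrected differently depending on context. To see only what the pipeline changed, one option is to compare the 'token' and 'checked' lists position by position. The sketch below is plain Python and needs no Spark session. The helper name list_corrections is hypothetical and not part of Spark NLP. It assumes the two lists are aligned one-to-one, as they are in the output shown here.

```python
def list_corrections(annotations):
    """Return (original, corrected) pairs for every token the checker changed.

    `annotations` is a list of dicts shaped like the annotate() output above,
    each with 'token' and 'checked' lists. Assumes both lists are aligned
    one-to-one, which holds for the sample output but is not guaranteed here.
    """
    corrections = []
    for result in annotations:
        for original, corrected in zip(result["token"], result["checked"]):
            if original != corrected:
                corrections.append((original, corrected))
    return corrections


# Sample data copied from the results shown above
sample = [
    {"checked": ["During", "the", "summer", "we", "have", "the", "best", "weather", "."],
     "document": ["During the summer we have the best ueather."],
     "token": ["During", "the", "summer", "we", "have", "the", "best", "ueather", "."]},
    {"checked": ["I", "have", "a", "black", "leather", "jacket", ",", "so", "nice", "."],
     "document": ["I have a black ueather jacket, so nice."],
     "token": ["I", "have", "a", "black", "ueather", "jacket", ",", "so", "nice", "."]},
]

print(list_corrections(sample))
# -> [('ueather', 'weather'), ('ueather', 'leather')]
```

With a live pipeline, the same helper could be applied to the list returned by pipeline.annotate(text).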

Model Information

Model Name: spellcheck_dl_pipeline
Type: pipeline
Compatibility: Spark NLP 4.0.0+
License: Open Source
Edition: Community
Language: en
Size: 99.8 MB

Included Models

  • DocumentAssembler
  • TokenizerModel
  • ContextSpellCheckerModel