Description
This pretrained spellchecker pipeline is built on top of the spellcheck_dl model. It is intended for PySpark 2.4.x users with Spark NLP 3.4.2 and above.
How to use
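# Python (assumes an active Spark session, e.g. one created with sparknlp.start())
from sparknlp.pretrained import PretrainedPipeline

pipeline = PretrainedPipeline("spellcheck_dl_pipeline", lang="en")
text = ["During the summer we have the best ueather.", "I have a black ueather jacket, so nice."]
pipeline.annotate(text)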
// Scala
import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline

val pipeline = new PretrainedPipeline("spellcheck_dl_pipeline", lang = "en")
val example = Array("During the summer we have the best ueather.", "I have a black ueather jacket, so nice.")
pipeline.annotate(example)
Results
[{'checked': ['During', 'the', 'summer', 'we', 'have', 'the', 'best', 'weather', '.'],
  'document': ['During the summer we have the best ueather.'],
  'token': ['During', 'the', 'summer', 'we', 'have', 'the', 'best', 'ueather', '.']},
 {'checked': ['I', 'have', 'a', 'black', 'leather', 'jacket', ',', 'so', 'nice', '.'],
  'document': ['I have a black ueather jacket, so nice.'],
  'token': ['I', 'have', 'a', 'black', 'ueather', 'jacket', ',', 'so', 'nice', '.']}]
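The results above pair each input token with its checked form, so the corrections can be recovered by comparing the two lists position by position. The following is a minimal sketch of that post-processing in plain Python. `find_corrections` is a hypothetical helper, not part of Spark NLP, and it assumes that `token` and `checked` align one-to-one, as they do in the output shown here.

```python
from typing import Dict, List, Tuple


def find_corrections(annotation: Dict[str, List[str]]) -> List[Tuple[str, str]]:
    """Return (original, corrected) pairs for tokens the spellchecker changed.

    Hypothetical helper: assumes 'token' and 'checked' have the same length
    and the same order, as in the results above.
    """
    tokens = annotation["token"]
    checked = annotation["checked"]
    return [(orig, fixed) for orig, fixed in zip(tokens, checked) if orig != fixed]


# Sample data copied from the results above.
results = [
    {
        "checked": ["During", "the", "summer", "we", "have", "the", "best", "weather", "."],
        "document": ["During the summer we have the best ueather."],
        "token": ["During", "the", "summer", "we", "have", "the", "best", "ueather", "."],
    },
    {
        "checked": ["I", "have", "a", "black", "leather", "jacket", ",", "so", "nice", "."],
        "document": ["I have a black ueather jacket, so nice."],
        "token": ["I", "have", "a", "black", "ueather", "jacket", ",", "so", "nice", "."],
    },
]

for annotation in results:
    # Print each input sentence next to the corrections made in it.
    print(annotation["document"][0], "->", find_corrections(annotation))
```

The same misspelling, "ueather", is corrected differently in each sentence, which reflects the context-sensitive behaviour of the spellchecker.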
Model Information
| Model Name: | spellcheck_dl_pipeline | 
| Type: | pipeline | 
| Compatibility: | Spark NLP 4.0.0+ | 
| License: | Open Source | 
| Edition: | Community | 
| Language: | en | 
| Size: | 99.8 MB | 
Included Models
- DocumentAssembler
- TokenizerModel
- ContextSpellCheckerModel