  • class ContextSpellCheckerModel extends AnnotatorModel[ContextSpellCheckerModel] with HasSimpleAnnotate[ContextSpellCheckerModel] with WeightedLevenshtein with WriteTensorflowModel with ParamsAndFeaturesWritable with HasTransducerFeatures with HasEngine

    Implements a deep-learning-based Noisy Channel Model Spell Algorithm. Correction candidates are extracted by combining context information with word information.

    Spell checking is a sequence-to-sequence mapping problem. Given an input sequence that may contain errors, ContextSpellChecker ranks correction sequences according to three criteria:

    1. Different correction candidates for each word (word level).
    2. The surrounding text of each word, i.e. its context (sentence level).
    3. The relative cost of each correction candidate, based on the character-level edit operations it requires (subword level).
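
The three criteria above can be illustrated with a toy ranking function. This is only a minimal sketch under stated assumptions, not the model's actual scoring: `RankingSketch`, `contextCost`, `editWeight`, and the cost values are hypothetical, a one-word lookup table stands in for the neural language model, and plain Levenshtein distance stands in for the weighted character-level cost.

```scala
object RankingSketch {
  // Toy context costs keyed by (previous word, candidate); lower means more plausible.
  // The numbers are invented for illustration (assumption, not model output).
  val contextCost: Map[(String, String), Double] = Map(
    ("with", "snow")  -> 1.0,
    ("with", "snowy") -> 0.5,
    ("with", "smog")  -> 3.0,
    ("with", "show")  -> 4.0
  )

  // Plain Levenshtein distance, computed row by row.
  // Stand-in for a weighted character-level edit cost (assumption).
  def editDistance(a: String, b: String): Int = {
    var prev = (0 to b.length).toArray
    for (i <- 1 to a.length) {
      val curr = new Array[Int](b.length + 1)
      curr(0) = i
      for (j <- 1 to b.length) {
        val substitution = prev(j - 1) + (if (a(i - 1) == b(j - 1)) 0 else 1)
        curr(j) = math.min(substitution, math.min(prev(j) + 1, curr(j - 1) + 1))
      }
      prev = curr
    }
    prev(b.length)
  }

  // Orders candidates by context cost plus a weighted edit cost (lowest total first).
  // `editWeight` is a hypothetical knob that balances the two signals.
  def rank(previous: String, observed: String,
           candidates: Seq[String], editWeight: Double): Seq[String] = {
    def score(c: String): Double =
      contextCost.getOrElse((previous, c), 10.0) + editWeight * editDistance(observed, c)
    candidates.map(c => c -> score(c)).sortWith(_._2 < _._2).map(_._1)
  }
}
```

With these toy numbers, `RankingSketch.rank("with", "smow", Seq("show", "smog", "snow", "snowy"), 1.0)` puts "snow" first, while lowering `editWeight` to 0.1 lets the context signal dominate and puts "snowy" first.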

    For an in-depth explanation of the module see the article Applying Context Aware Spell Checking in Spark NLP.

    This is the instantiated model of the ContextSpellCheckerApproach. For training your own model, please see the documentation of that class.

    Pretrained models can be loaded with the pretrained method of the companion object:

    val spellChecker = ContextSpellCheckerModel.pretrained()
      .setInputCols("token")
      .setOutputCol("checked")

    If no name is provided, the default model is "spellcheck_dl". For available pretrained models, see the Models Hub.

    For extended examples of usage, see the Examples and the ContextSpellCheckerTestSpec.

    Example

    import spark.implicits._
    import com.johnsnowlabs.nlp.DocumentAssembler
    import com.johnsnowlabs.nlp.annotators.Tokenizer
    import com.johnsnowlabs.nlp.annotators.spell.context.ContextSpellCheckerModel
    import org.apache.spark.ml.Pipeline
    
    val documentAssembler = new DocumentAssembler()
      .setInputCol("text")
      .setOutputCol("doc")
    
    val tokenizer = new Tokenizer()
      .setInputCols(Array("doc"))
      .setOutputCol("token")
    
    val spellChecker = ContextSpellCheckerModel
      .pretrained()
      .setTradeOff(12.0f)
      .setInputCols("token")
      .setOutputCol("checked")
    
    val pipeline = new Pipeline().setStages(Array(
      documentAssembler,
      tokenizer,
      spellChecker
    ))
    
    val data = Seq("It was a cold , dreary day and the country was white with smow .").toDF("text")
    val result = pipeline.fit(data).transform(data)
    
    result.select("checked.result").show(false)
    +--------------------------------------------------------------------------------+
    |result                                                                          |
    +--------------------------------------------------------------------------------+
    |[It, was, a, cold, ,, dreary, day, and, the, country, was, white, with, snow, .]|
    +--------------------------------------------------------------------------------+
    Definition Classes
    context
    See also

    NorvigSweetingModel and SymmetricDeleteModel for alternative approaches to spell checking

  • StringTools

implicit class StringTools extends AnyRef

Linear Supertypes
AnyRef, Any

Instance Constructors

  1. new StringTools(s: String)
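
Because StringTools is an implicit class with a single String constructor parameter, its methods become available directly on String values wherever the class is in scope. The following is a minimal sketch of that pattern, not the library's source: the wrapper object `StringToolsSketch` is hypothetical, and the method bodies (including how empty strings are handled) are assumptions chosen to match the method names listed on this page.

```scala
object StringToolsSketch {
  // Hypothetical re-implementation for illustration; bodies are assumptions.
  implicit class StringTools(s: String) {
    // True when upper-casing the string leaves it unchanged.
    def isUpperCase(): Boolean = s == s.toUpperCase

    // True when lower-casing the string leaves it unchanged.
    def isLowerCase(): Boolean = s == s.toLowerCase

    // True when the string is non-empty and starts with an upper-case character.
    def isFirstLetterCapitalized(): Boolean = s.headOption.exists(_.isUpper)

    // Upper-cases the first character; an empty string is returned unchanged.
    def capitalizeFirstLetter(): String =
      if (s.isEmpty) s else s.head.toUpper.toString + s.tail
  }
}
```

After `import StringToolsSketch._`, a call such as `"snow".capitalizeFirstLetter()` is rewritten by the compiler to `new StringTools("snow").capitalizeFirstLetter()`.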

Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##(): Int
    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  5. def capitalizeFirstLetter(): String
  6. def clone(): AnyRef
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
  7. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  8. def equals(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  9. def finalize(): Unit
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  10. final def getClass(): Class[_]
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  11. def hashCode(): Int
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  12. def isFirstLetterCapitalized(): Boolean
  13. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  14. def isLowerCase(): Boolean
  15. def isUpperCase(): Boolean
  16. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  17. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  18. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  19. final def synchronized[T0](arg0: ⇒ T0): T0
    Definition Classes
    AnyRef
  20. def toString(): String
    Definition Classes
    AnyRef → Any
  21. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  22. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  23. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
