package coref

Type Members

trait ReadSpanBertCorefTensorflowModel extends ReadTensorflowModel
trait ReadablePretrainedSpanBertCorefModel extends ParamsAndFeaturesReadable[SpanBertCorefModel] with HasPretrained[SpanBertCorefModel]

class SpanBertCorefModel extends AnnotatorModel[SpanBertCorefModel] with HasSimpleAnnotate[SpanBertCorefModel] with WriteTensorflowModel with HasEmbeddingsProperties with HasStorageRef with HasCaseSensitiveProperties with HasEngine

A coreference resolution model based on SpanBert.

A coreference resolution model identifies expressions in a text that refer to the same entity. For example, given the sentence "John told Mary he would like to borrow a book from her.", the model links "he" to "John" and "her" to "Mary".

This model is based on SpanBert, which is fine-tuned on the OntoNotes 5.0 data set.

Pretrained models can be loaded with the pretrained method of the companion object:

val corefResolution = SpanBertCorefModel.pretrained()
  .setInputCols("sentence", "token")
  .setOutputCol("corefs")

If no name is provided, the default model is "spanbert_base_coref". For available pretrained models, see the Models Hub.

For extended usage examples, see the Examples.

References:

https://github.com/mandarjoshi90/coref

Example

import spark.implicits._
import com.johnsnowlabs.nlp.base._
import com.johnsnowlabs.nlp.annotator._
import org.apache.spark.ml.Pipeline

val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val sentence = new SentenceDetector()
  .setInputCols("document")
  .setOutputCol("sentence")

val tokenizer = new Tokenizer()
  .setInputCols("sentence")
  .setOutputCol("token")

val corefResolution = SpanBertCorefModel.pretrained()
  .setInputCols("sentence", "token")
  .setOutputCol("corefs")

val pipeline = new Pipeline().setStages(Array(
  documentAssembler,
  sentence,
  tokenizer,
  corefResolution
))

val data = Seq(
  "John told Mary he would like to borrow a book from her."
).toDF("text")

val result = pipeline.fit(data).transform(data)

result.selectExpr("explode(corefs) AS coref")
  .selectExpr("coref.result as token", "coref.metadata").show(truncate = false)

+-----+------------------------------------------------------------------------------------+
|token|metadata                                                                            |
+-----+------------------------------------------------------------------------------------+
|John |{head.sentence -> -1, head -> ROOT, head.begin -> -1, head.end -> -1, sentence -> 0}|
|he   |{head.sentence -> 0, head -> John, head.begin -> 0, head.end -> 3, sentence -> 0}   |
|Mary |{head.sentence -> -1, head -> ROOT, head.begin -> -1, head.end -> -1, sentence -> 0}|
|her  |{head.sentence -> 0, head -> Mary, head.begin -> 10, head.end -> 13, sentence -> 0} |
+-----+------------------------------------------------------------------------------------+
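
The metadata in the output above links each mention to the head of its cluster. A minimal sketch of grouping mentions into clusters from that metadata follows. It is plain Scala with no Spark dependency. Mention, CorefClusters, headKey, clusters and sample are hypothetical names introduced for illustration and are not part of the library. The sketch assumes that every non-head mention points directly at its cluster head and that head.begin and head.end match the head annotation's own begin and end offsets. The offsets for "he" and "her" were computed by hand from the example sentence and do not appear in the output above.

```scala
// Hypothetical stand-in for one annotation of the exploded "corefs" column.
final case class Mention(
    text: String,                 // coref.result
    begin: Int,                   // coref.begin (assumed character offset)
    end: Int,                     // coref.end (assumed inclusive)
    metadata: Map[String, String] // coref.metadata
)

object CorefClusters {
  // (sentence index, begin offset, end offset) of a cluster's head mention.
  type HeadKey = (Int, Int, Int)

  // A mention whose head is ROOT is itself the head of a cluster;
  // any other mention points at its head through the head.* entries.
  def headKey(m: Mention): HeadKey =
    if (m.metadata("head") == "ROOT")
      (m.metadata("sentence").toInt, m.begin, m.end)
    else
      (m.metadata("head.sentence").toInt,
       m.metadata("head.begin").toInt,
       m.metadata("head.end").toInt)

  // Group mentions by head; clusters and their members are ordered by position.
  def clusters(mentions: Seq[Mention]): Seq[Seq[String]] =
    mentions
      .groupBy(headKey)
      .toSeq
      .sortBy { case (key, _) => key }
      .map { case (_, ms) =>
        ms.sortBy(m => (m.metadata("sentence").toInt, m.begin)).map(_.text)
      }

  private def root(sentence: Int): Map[String, String] = Map(
    "head.sentence" -> "-1", "head" -> "ROOT",
    "head.begin" -> "-1", "head.end" -> "-1",
    "sentence" -> sentence.toString)

  // The four mentions from the example; offsets for "he" and "her" are hand-computed.
  val sample: Seq[Mention] = Seq(
    Mention("John", 0, 3, root(0)),
    Mention("he", 15, 16, Map("head.sentence" -> "0", "head" -> "John",
      "head.begin" -> "0", "head.end" -> "3", "sentence" -> "0")),
    Mention("Mary", 10, 13, root(0)),
    Mention("her", 51, 53, Map("head.sentence" -> "0", "head" -> "Mary",
      "head.begin" -> "10", "head.end" -> "13", "sentence" -> "0"))
  )
}
```

Calling CorefClusters.clusters(CorefClusters.sample) groups "John" with "he" and "Mary" with "her". Keying clusters on offsets rather than on the head text keeps two distinct entities with the same surface form apart.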

Value Members

object SpanBertCorefModel extends ReadablePretrainedSpanBertCorefModel with ReadSpanBertCorefTensorflowModel with Serializable
This is the companion object of SpanBertCorefModel. Please refer to that class for the documentation.
