pretrained

package pretrained

Type Members

case class PretrainedPipeline(downloadName: String, lang: String = "en", source: String = ResourceDownloader.publicLoc, parseEmbeddingsVectors: Boolean = false, diskLocation: Option[String] = None) extends Product with Serializable

Represents a fully constructed and trained Spark NLP pipeline, ready to be used. This allows a whole pipeline to be defined in one line. Additionally, the LightPipeline version of the model can be retrieved through the member lightModel.

For more extended examples, see the Pipelines page and our GitHub Model Repository for available pipeline models.

Example

import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline
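import com.johnsnowlabs.nlp.SparkNLP

// Start a Spark session configured for Spark NLP (not needed if `spark` already exists, e.g. in spark-shell)
val spark = SparkNLP.start()
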
val testData = spark.createDataFrame(Seq(
(1, "Google has announced the release of a beta version of the popular TensorFlow machine learning library"),
(2, "Donald John Trump (born June 14, 1946) is the 45th and current president of the United States")
)).toDF("id", "text")

val pipeline = PretrainedPipeline("explain_document_dl", lang="en")

val annotation = pipeline.transform(testData)

annotation.select("entities.result").show(false)

/*
+----------------------------------+
|result                            |
+----------------------------------+
|[Google, TensorFlow]              |
|[Donald John Trump, United States]|
+----------------------------------+
*/

downloadName: Name of the Pipeline Model
lang: Language of the defined pipeline (Default: "en")
source: Source where to get the Pipeline Model
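
As a rough sketch of the lightModel member mentioned above: the LightPipeline can annotate plain strings without building a DataFrame first. The snippet assumes that LightPipeline offers an annotate method returning a map from output column name to results, and that "explain_document_dl" exposes an "entities" column as in the example above. The input sentence is made up for illustration, and the code is meant for spark-shell or a Scala script.

```scala
import com.johnsnowlabs.nlp.SparkNLP
import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline

// A Spark session is still needed to download and load the pipeline
val spark = SparkNLP.start()

val pipeline = PretrainedPipeline("explain_document_dl", lang = "en")

// lightModel is the LightPipeline counterpart of the loaded pipeline
val light = pipeline.lightModel

// Assumption: annotate returns Map[String, Seq[String]] keyed by output column
val result: Map[String, Seq[String]] =
  light.annotate("Google has announced the release of a beta version of TensorFlow")

// Print the recognized entity chunks, if the pipeline produced any
println(result.getOrElse("entities", Seq.empty).mkString(", "))
```

Working on plain strings this way tends to suit small inputs or interactive use, while transform on a DataFrame remains the option for distributed data.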

case class RepositoryMetadata(repoFolder: String, lastModified: Date, lastMetadataDownloaded: Date, metadata: List[ResourceMetadata]) extends Product with Serializable
Describes the state of a repository. A repository can be any S3 folder that has a metadata.json file describing the list of resources inside.

Attributes
protected
trait ResourceDownloader extends AnyRef
case class ResourceMetadata(name: String, language: Option[String], libVersion: Option[Version], sparkVersion: Option[Version], readyToUse: Boolean, time: Timestamp, isZipped: Boolean = false, category: Option[String] = ..., checksum: String = "", annotator: Option[String] = None, engine: Option[String] = None) extends Ordered[ResourceMetadata] with Product with Serializable
case class ResourceRequest(name: String, language: Option[String] = None, folder: String = ResourceDownloader.publicLoc, libVersion: Version = ResourceDownloader.libVersion, sparkVersion: Version = ResourceDownloader.sparkVersion) extends Product with Serializable
class S3ResourceDownloader extends ResourceDownloader
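
To illustrate the default arguments in the ResourceRequest signature above, here is a minimal sketch that builds a request from a name and a language only. The pipeline name is reused from the PretrainedPipeline example purely for illustration. Evaluating the default sparkVersion may require an active Spark session, which is an assumption about the library's internals rather than something stated on this page.

```scala
import com.johnsnowlabs.nlp.pretrained.{ResourceDownloader, ResourceRequest}

// Only name and language are given; folder, libVersion and sparkVersion
// fall back to the defaults declared in the case class signature
val request = ResourceRequest("explain_document_dl", language = Some("en"))

println(request.name)
println(request.language)

// The default folder is the public location defined on ResourceDownloader
println(request.folder == ResourceDownloader.publicLoc)
```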

Value Members

object PretrainedPipeline extends Serializable
object PythonResourceDownloader
object ResourceDownloader
object ResourceMetadata extends Serializable
object ResourceType extends Enumeration
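
The ResourceDownloader object can also be used to explore what is available before choosing a downloadName. The sketch below assumes a showPublicPipelines helper on that object, which is not listed in the summary on this page, so treat the exact name and parameter as an assumption to verify against the library version in use.

```scala
import com.johnsnowlabs.nlp.SparkNLP
import com.johnsnowlabs.nlp.pretrained.ResourceDownloader

// The downloader resolves versions against the running Spark session
val spark = SparkNLP.start()

// Assumption: prints a table of public pipelines for the given language
ResourceDownloader.showPublicPipelines(lang = "en")
```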
