package pretrained
Type Members
- case class PretrainedPipeline(downloadName: String, lang: String = "en", source: String = ResourceDownloader.publicLoc, parseEmbeddingsVectors: Boolean = false, diskLocation: Option[String] = None) extends Product with Serializable
Represents a fully constructed and trained Spark NLP pipeline, ready to be used. This way, a whole pipeline can be defined in one line. Additionally, the LightPipeline version of the model can be retrieved with the member lightModel.
For more extended examples, see the Pipelines page and our GitHub Model Repository for available pipeline models.
Example
import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline
import com.johnsnowlabs.nlp.SparkNLP

val testData = spark.createDataFrame(Seq(
  (1, "Google has announced the release of a beta version of the popular TensorFlow machine learning library"),
  (2, "Donald John Trump (born June 14, 1946) is the 45th and current president of the United States")
)).toDF("id", "text")

val pipeline = PretrainedPipeline("explain_document_dl", lang="en")

val annotation = pipeline.transform(testData)

annotation.select("entities.result").show(false)

/*
+----------------------------------+
|result                            |
+----------------------------------+
|[Google, TensorFlow]              |
|[Donald John Trump, United States]|
+----------------------------------+
*/
- downloadName
Name of the Pipeline Model
- lang
Language of the defined pipeline (Default: "en")
- source
Source from which to get the Pipeline Model
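The lightModel member mentioned in the description can be used to annotate plain strings without building a DataFrame. The following is a minimal sketch, not a definitive usage pattern: the object name LightPipelineSketch is hypothetical, the pipeline name and the "entities" output column are taken from the example above, and it assumes Spark NLP is on the classpath and the pipeline can be downloaded.

```scala
import com.johnsnowlabs.nlp.SparkNLP
import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline

// Hypothetical wrapper object, used only to make the sketch self-contained.
object LightPipelineSketch {
  def main(args: Array[String]): Unit = {
    // Start (or reuse) a Spark session configured for Spark NLP.
    val spark = SparkNLP.start()

    // Downloads the pipeline on first use, so network access is assumed.
    val pipeline = PretrainedPipeline("explain_document_dl", lang = "en")

    // lightModel is the LightPipeline version of the same pipeline.
    val light = pipeline.lightModel

    // annotate takes a plain String and returns a map from output column
    // name to the annotation results for that column.
    val result: Map[String, Seq[String]] = light.annotate(
      "Google has announced the release of a beta version of the popular TensorFlow machine learning library"
    )

    // "entities" is the column selected in the DataFrame example above.
    println(result.getOrElse("entities", Seq.empty).mkString(", "))

    spark.stop()
  }
}
```

This route is typically convenient for small inputs or interactive use, while transform on a DataFrame remains the choice for distributed data.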
- case class RepositoryMetadata(repoFolder: String, lastModified: Date, lastMetadataDownloaded: Date, metadata: List[ResourceMetadata]) extends Product with Serializable
Describes the state of a repository. A repository can be any S3 folder that has a metadata.json file describing the list of resources inside.
- Attributes
- protected
- trait ResourceDownloader extends AnyRef
- case class ResourceMetadata(name: String, language: Option[String], libVersion: Option[Version], sparkVersion: Option[Version], readyToUse: Boolean, time: Timestamp, isZipped: Boolean = false, category: Option[String] = ..., checksum: String = "", annotator: Option[String] = None, engine: Option[String] = None) extends Ordered[ResourceMetadata] with Product with Serializable
- case class ResourceRequest(name: String, language: Option[String] = None, folder: String = ResourceDownloader.publicLoc, libVersion: Version = ResourceDownloader.libVersion, sparkVersion: Version = ResourceDownloader.sparkVersion) extends Product with Serializable
- class S3ResourceDownloader extends ResourceDownloader
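As the ResourceRequest signature above shows, only name is required; the remaining fields fall back to defaults taken from ResourceDownloader. A minimal sketch of constructing a request, assuming Spark NLP is on the classpath (the object name ResourceRequestSketch is hypothetical, and the resource name is borrowed from the PretrainedPipeline example):

```scala
import com.johnsnowlabs.nlp.pretrained.{ResourceDownloader, ResourceRequest}

// Hypothetical wrapper object, used only to make the sketch self-contained.
object ResourceRequestSketch {
  def main(args: Array[String]): Unit = {
    // Only name and language are given here; folder, libVersion and
    // sparkVersion take the defaults declared in the signature.
    val request = ResourceRequest("explain_document_dl", language = Some("en"))

    println(s"name:     ${request.name}")
    println(s"language: ${request.language}")

    // folder defaults to ResourceDownloader.publicLoc per the signature.
    println(s"uses public location: ${request.folder == ResourceDownloader.publicLoc}")
  }
}
```

Evaluating the default for sparkVersion may require an active Spark session, so running this inside an existing Spark NLP application is the safer assumption.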
Value Members
- object PretrainedPipeline extends Serializable
- object PythonResourceDownloader
- object ResourceDownloader
- object ResourceMetadata extends Serializable
- object ResourceType extends Enumeration