Basic NLP Pipeline for Spanish from TEMU-BSC for PlanTL

Description

Pretrained basic NLP pipeline by TEMU-BSC for PlanTL-GOB-ES. It performs tokenization, lemmatization, NER, embeddings, and normalization, using the roberta_base_bne transformer.


How to use

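import sparknlp
from sparknlp.annotator import *
from sparknlp.base import LightPipeline
from sparknlp.pretrained import PretrainedPipeline

# Start a Spark session with Spark NLP loaded
spark = sparknlp.start()

# Download the pretrained Spanish pipeline from the community namespace
pipeline = PretrainedPipeline("pipeline_bsc_roberta_base_bne", "es", "@cayorodriguez")

# Wrap the underlying PipelineModel for fast annotation of plain strings
light_model = LightPipeline(pipeline.model)
text = "La Reserva Federal de el Gobierno de EE UU aprueba una de las mayores subidas de tipos de interés desde 1994."
light_result = light_model.annotate(text)

# PretrainedPipeline can also annotate a string directly
result = pipeline.annotate("Veo al hombre de los Estados Unidos con el telescopio")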
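
The annotate call returns a plain Python dictionary that maps each output column of the pipeline to a list of strings. This page does not list the column names, so the sketch below does not assume them: the helper summarize_annotations (a hypothetical name, not part of Spark NLP) only reports how many annotations each column holds. The sample dictionary is a made-up stand-in with the same shape, not real output from this pipeline.

```python
from typing import Dict, List


def summarize_annotations(result: Dict[str, List[str]]) -> Dict[str, int]:
    """Return the number of annotations found in each output column."""
    return {column: len(values) for column, values in result.items()}


# Made-up stand-in shaped like an annotate() result.
# The keys are illustrative only, not this pipeline's real column names.
sample = {"token": ["Veo", "al", "hombre"], "entities": []}

for column, count in summarize_annotations(sample).items():
    print(f"{column}: {count} annotations")
```

With a running Spark session, the same helper could be applied to result or light_result from the snippet above to see which columns the pipeline actually produces.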

Model Information

Model Name: pipeline_bsc_roberta_base_bne
Type: pipeline
Compatibility: Spark NLP 4.0.0+
License: Open Source
Edition: Community
Language: es
Size: 2.0 GB
Dependencies: roberta_base_bne

Included Models

  • DocumentAssembler
  • SentenceDetectorDLModel
  • TokenizerModel
  • NormalizerModel
  • StopWordsCleaner
  • RoBertaEmbeddings
  • SentenceEmbeddings
  • EmbeddingsFinisher
  • LemmatizerModel
  • RoBertaForTokenClassification
  • RoBertaForTokenClassification
  • NerConverter
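
One way to check this list against what is actually loaded is to walk the stages of the underlying Spark ML pipeline model. The sketch below assumes only that the object passed in exposes a stages list, as pyspark.ml.PipelineModel does. The helper stage_class_names is a hypothetical name, and the stand-in classes exist solely so the example runs without Spark.

```python
from types import SimpleNamespace
from typing import Any, List


def stage_class_names(pipeline_model: Any) -> List[str]:
    """Return the class name of each stage, in pipeline order."""
    return [type(stage).__name__ for stage in pipeline_model.stages]


# Stand-in classes so the sketch runs without a Spark session;
# they only borrow the names of two stages from the list above.
class DocumentAssembler:
    pass


class TokenizerModel:
    pass


fake_model = SimpleNamespace(stages=[DocumentAssembler(), TokenizerModel()])
print(stage_class_names(fake_model))
```

With the pipeline loaded as in the How to use section, the equivalent call would be stage_class_names(pipeline.model).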