Translate Marathi to English Pipeline

Description

Marian is an efficient, free Neural Machine Translation framework written in pure C++ with minimal dependencies. It is mainly being developed by the Microsoft Translator team. Many academic (most notably the University of Edinburgh and in the past the Adam Mickiewicz University in Poznań) and commercial contributors help with its development.

It is currently the engine behind the Microsoft Translator Neural Machine Translation services and is deployed by many companies, organizations, and research projects.

Note that this is a very computationally expensive module, especially on longer sequences. The use of an accelerator such as a GPU is recommended.

  • source languages: mr

  • target languages: en
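Since the cost noted above grows with sequence length, one way to keep each call bounded is to group sentences into batches under a character budget before translating them. The following is a minimal plain-Python sketch of that idea; the helper `batch_by_length` and the `max_chars` value are hypothetical and not part of Spark NLP.

```python
def batch_by_length(sentences, max_chars=200):
    """Group sentences into batches whose total length stays within max_chars.

    Hypothetical helper for illustration; a single sentence longer than
    max_chars still gets a batch of its own.
    """
    batches, current, size = [], [], 0
    for sentence in sentences:
        # Start a new batch when adding this sentence would exceed the budget.
        if current and size + len(sentence) > max_chars:
            batches.append(current)
            current, size = [], 0
        current.append(sentence)
        size += len(sentence)
    if current:
        batches.append(current)
    return batches


if __name__ == "__main__":
    for batch in batch_by_length(["aaaa", "bbbb", "cc"], max_chars=8):
        print(batch)
```

Each batch could then be translated separately, so that no single call has to process an arbitrarily long input. A suitable budget depends on the available hardware and would need to be found by experiment.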


How to use

Python:

from sparknlp.pretrained import PretrainedPipeline

pipeline = PretrainedPipeline("translate_mr_en", lang = "xx")

result = pipeline.annotate("मला वाचायला आवडते.")

Scala:

import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline

val pipeline = new PretrainedPipeline("translate_mr_en", lang = "xx")

val result = pipeline.annotate("मला वाचायला आवडते.")

NLU:

import nlu

text = ["मला वाचायला आवडते."]

translate_df = nlu.load('xx.mr.translate_to.en').predict(text, output_level='sentence')

translate_df
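The Python call to `annotate` returns a dictionary keyed by output column. The sketch below shows one way to pull the translated text out of such a result. It assumes the output column is named `translation`, matching the column in the Results table; the helper `extract_translations` and the hand-written `sample_result` are illustrative only and are not produced by running the pipeline.

```python
def extract_translations(result, key="translation"):
    """Return the translated sentences from an annotate()-style dict.

    The key name "translation" is an assumption based on the Results table.
    """
    value = result.get(key, [])
    # Normalize a bare string to a one-element list.
    if isinstance(value, str):
        return [value]
    return list(value)


# Hand-written stand-in for a pipeline result (assumed shape, not real output).
sample_result = {
    "sentence": ["मला वाचायला आवडते."],
    "translation": ["I like reading."],
}

if __name__ == "__main__":
    for source, target in zip(sample_result["sentence"],
                              extract_translations(sample_result)):
        print(source, "->", target)
```

If the actual result uses different keys, inspecting `result.keys()` would show which column holds the translation.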

Results

+------------------------------+---------------------------+
|sentence                      |translation                |
+------------------------------+---------------------------+
|मला वाचायला आवडते.               |I like reading.            | 
+------------------------------+---------------------------+

Model Information

Model Name: translate_mr_en
Type: pipeline
Compatibility: Spark NLP 2.7.0+
Edition: Official
Language: xx

Data Source

https://github.com/Helsinki-NLP/OPUS-MT-train/tree/master/models