BART (large-sized model), fine-tuned on CNN Daily Mail

Description

BART model pre-trained on English, and fine-tuned on CNN Daily Mail. It was introduced in the paper BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension by Lewis et al. and first released in this repository (https://github.com/pytorch/fairseq/tree/master/examples/bart).

Disclaimer: The team releasing BART did not write a model card for this model, so this model card has been written by the Hugging Face team.

Model description

BART is a transformer encoder-decoder (seq2seq) model with a bidirectional (BERT-like) encoder and an autoregressive (GPT-like) decoder. BART is pre-trained by (1) corrupting text with an arbitrary noising function, and (2) learning a model to reconstruct the original text.
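To make the pre-training objective concrete, here is a toy Python sketch of one such noising step, in the spirit of the text-infilling noise described in the BART paper; the function, parameters, and whitespace tokenization are illustrative only and are not BART's actual implementation.

import random

def text_infilling(tokens, mask_token="<mask>", mask_ratio=0.35):
    # Toy noising function: replace one random contiguous span of tokens with a
    # single mask token. (In the paper, span lengths are drawn from a Poisson
    # distribution and the noise is applied over subword tokens.)
    span_len = max(1, int(len(tokens) * mask_ratio))
    start = random.randrange(0, len(tokens) - span_len + 1)
    return tokens[:start] + [mask_token] + tokens[start + span_len:]

original = "the cat sat on the mat".split()
corrupted = text_infilling(original)
print(corrupted)  # one possible output: ['the', 'cat', '<mask>', 'the', 'mat']
# The encoder reads the corrupted sequence; the decoder is trained to
# reconstruct the original sequence token by token.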

BART is particularly effective when fine-tuned for text generation (e.g. summarization, translation) but also works well for comprehension tasks (e.g. text classification, question answering). This particular checkpoint has been fine-tuned on CNN Daily Mail, a large collection of text-summary pairs.


How to use


# Python API
from sparknlp.annotator import BartTransformer

bart = BartTransformer.pretrained("bart_large_cnn") \
    .setTask("summarize:") \
    .setMaxOutputLength(200) \
    .setInputCols(["documents"]) \
    .setOutputCol("summaries")


// Scala API
import com.johnsnowlabs.nlp.annotators.seq2seq.BartTransformer

val bart = BartTransformer.pretrained("bart_large_cnn")
    .setTask("summarize:")
    .setMaxOutputLength(200)
    .setInputCols("documents")
    .setOutputCol("summaries")

Model Information

Model Name: bart_large_cnn
Compatibility: Spark NLP 5.4.0+
License: Open Source
Edition: Official
Input Labels: [documents]
Output Labels: [generation]
Language: en
Size: 974.9 MB