Text Generation


Text Generation is a natural language processing task where models create new text based on a given input. Instead of assigning labels, these models expand, complete, or rephrase text in a coherent way. For example, given “Once upon a time,” a text generation model might continue with “we knew that our ancestors were on the verge of extinction…”. This task includes both completion models, which predict the next word in a sequence to build longer passages, and text-to-text models, which map one piece of text to another for tasks like translation, summarization, or classification.
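Completion models work autoregressively: each new word is predicted from the text so far, appended, and then used as context for the next prediction. A toy sketch of that loop, using a hard-coded bigram lookup table in place of a real language model (purely illustrative):

```python
# Toy next-word "model": a bigram lookup table standing in for a neural network.
BIGRAMS = {
    "once": "upon",
    "upon": "a",
    "a": "time",
}

def complete(prompt_words, max_new_words=5):
    """Autoregressive completion: repeatedly predict the next word from the
    last word generated so far, and append it to the sequence."""
    words = list(prompt_words)
    for _ in range(max_new_words):
        nxt = BIGRAMS.get(words[-1].lower())
        if nxt is None:  # no known continuation: stop early
            break
        words.append(nxt)
    return words

# complete(["Once"]) -> ["Once", "upon", "a", "time"]
```

A real model replaces the lookup table with a learned probability distribution over the whole vocabulary, but the generate-append-repeat loop is the same.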

Depending on how they are trained, text generation models come in different variants: base models (e.g., Mistral 7B, Llama 3 70B) suited for fine-tuning; instruction-tuned models (e.g., Qwen 2, Yi 1.5, Llama 3 70B Instruct) that follow prompts like "Write a recipe for chocolate cake"; and human-feedback models, which use reinforcement learning from human feedback (RLHF) to align outputs with human preferences. These capabilities make text generation useful for a wide range of applications, from chatbots and creative writing to code generation and summarization, with larger models typically producing more fluent and context-aware outputs.

Picking a Model

When picking a model for text generation, start by clarifying your goal: completions, rephrasings, translations, summaries, or creative writing. Base models like Mistral 7B or Llama 3 70B are good starting points for fine-tuning, while instruction-tuned ones such as Qwen 2 or Llama 3 70B Instruct work better out of the box for prompts like "Write a recipe for chocolate cake." Human-feedback models trained with RLHF usually give the most user-aligned responses. For quick or lightweight tasks, smaller models are efficient, while larger ones generally produce more fluent, context-aware text suited for chatbots, code generation, and long-form writing.

For specific tasks:

- Summarization: Pegasus, BART, or Llama 3 Instruct
- Translation: MarianMT, M2M-100, or NLLB
- Creative writing: GPT-based models, Llama 3 70B Instruct, and Yi 1.5
- Code generation: Code Llama, StarCoder, and GPT-4 Turbo (Code)
- Dialogue: Llama 3 Instruct, Qwen 2, and GPT-4

Explore models tailored for text generation at Spark NLP Models

How to use

import sparknlp
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import AutoGGUFModel
from pyspark.ml import Pipeline

# Start a Spark session with Spark NLP loaded
spark = sparknlp.start()

document_assembler = DocumentAssembler() \
    .setInputCol("text") \
    .setOutputCol("document")

# Load a quantized GGUF model; nPredict caps the number of generated tokens,
# nGpuLayers offloads layers to the GPU, and temperature/top-k/top-p control
# how the next token is sampled.
auto_gguf_model = AutoGGUFModel.pretrained("qwen3_4b_q4_k_m_gguf", "en") \
    .setInputCols(["document"]) \
    .setOutputCol("completions") \
    .setBatchSize(4) \
    .setNPredict(20) \
    .setNGpuLayers(99) \
    .setTemperature(0.4) \
    .setTopK(40) \
    .setTopP(0.9) \
    .setPenalizeNl(True)

pipeline = Pipeline().setStages([
    document_assembler,
    auto_gguf_model
])

data = spark.createDataFrame([
    ["A farmer has 17 sheep. All but 9 run away. How many sheep does the farmer have left?"]
]).toDF("text")

model = pipeline.fit(data)
result = model.transform(data)
result.select("completions.result").show(truncate=False)

import com.johnsnowlabs.nlp.base._
import com.johnsnowlabs.nlp.annotators._
import org.apache.spark.ml.Pipeline
import spark.implicits._ // required for Seq(...).toDF

val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

// Same configuration as the Python example: nPredict caps generated tokens,
// nGpuLayers offloads layers to the GPU, temperature/top-k/top-p control sampling.
val autoGGUFModel = AutoGGUFModel.pretrained("qwen3_4b_q4_k_m_gguf", "en")
  .setInputCols("document")
  .setOutputCol("completions")
  .setBatchSize(4)
  .setNPredict(20)
  .setNGpuLayers(99)
  .setTemperature(0.4f)
  .setTopK(40)
  .setTopP(0.9f)
  .setPenalizeNl(true)

val pipeline = new Pipeline().setStages(Array(
  documentAssembler,
  autoGGUFModel
))

val data = Seq("A farmer has 17 sheep. All but 9 run away. How many sheep does the farmer have left?").toDF("text")

val model = pipeline.fit(data)
val result = model.transform(data)

Sample output:

Explanation:
The phrase "all but 9 run away" means that 9 sheep did not run away, while the remaining 8 (17 - 9) did. Therefore, the farmer still has the 9 sheep that stayed behind.
Answer: 9.
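The setTemperature, setTopK, and setTopP calls in the pipeline above control how each new token is chosen from the model's output distribution. A minimal plain-Python sketch of that decoding logic (an illustration of the general technique, not Spark NLP's or llama.cpp's actual implementation):

```python
import math
import random

def sample_next_token(logits, temperature=0.4, top_k=40, top_p=0.9, rng=None):
    """Pick a token index from raw logits using temperature scaling,
    then top-k filtering, then top-p (nucleus) filtering."""
    rng = rng or random.Random(0)
    # Temperature: values below 1 sharpen the distribution toward the top logits.
    scaled = [l / temperature for l in logits]
    # Numerically stable softmax over the scaled logits.
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [(i, e / total) for i, e in enumerate(exps)]
    # Top-k: keep only the k most probable tokens.
    probs.sort(key=lambda pair: pair[1], reverse=True)
    probs = probs[:top_k]
    # Top-p: keep the smallest prefix whose cumulative probability reaches p.
    kept, cum = [], 0.0
    for i, p in probs:
        kept.append((i, p))
        cum += p
        if cum >= top_p:
            break
    # Renormalize over the surviving tokens and sample one.
    z = sum(p for _, p in kept)
    r = rng.random() * z
    for i, p in kept:
        r -= p
        if r <= 0:
            return i
    return kept[-1][0]

# With one token dominating the logits and a low temperature, the filtered
# distribution collapses to that token:
# sample_next_token([0.0, 1.0, 8.0, 0.5]) -> 2
```

Lower temperature and smaller top-k/top-p make output more deterministic; higher values trade coherence for diversity.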

Try Real-Time Demos!

If you want to see the outputs of text generation models in real time, visit our interactive demos:

Useful Resources

Want to dive deeper into text generation with Spark NLP? Here are some curated resources to help you get started and explore further:

Articles and Guides

Notebooks
