Qwen3-4B GGUF (F16 Quantized) by Qwen

Description

Qwen3 is the latest generation of large language models in the Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models. Built upon extensive training, Qwen3 delivers groundbreaking advancements in reasoning, instruction following, agent capabilities, and multilingual support.

Original model from https://huggingface.co/Qwen/Qwen3-4B


How to use

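# Python
import sparknlp
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import AutoGGUFModel
from pyspark.ml import Pipeline

# Start (or reuse) a Spark session configured for Spark NLP
spark = sparknlp.start()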

document_assembler = DocumentAssembler()\
    .setInputCol("text")\
    .setOutputCol("document")

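# nPredict=-1 sets no fixed cap on the number of generated tokens.
# nGpuLayers=99 requests that up to 99 layers be offloaded to the GPU,
# which only takes effect on a GPU-enabled setup.
auto_gguf_model = AutoGGUFModel.pretrained("qwen3_4b_bf16_gguf", "en") \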
    .setInputCols(["document"]) \
    .setOutputCol("completions") \
    .setBatchSize(4) \
    .setNPredict(-1) \
    .setNGpuLayers(99) \
    .setTemperature(0.4) \
    .setTopK(40) \
    .setTopP(0.9) \
    .setPenalizeNl(True)

pipeline = Pipeline().setStages([
    document_assembler,
    auto_gguf_model
])

data = spark.createDataFrame([
    ["Give me a short introduction to large language model."]
]).toDF("text")

model = pipeline.fit(data)
result = model.transform(data)

result.select("completions").show(truncate=False)

// Scala
import com.johnsnowlabs.nlp.base.DocumentAssembler
import com.johnsnowlabs.nlp.annotators.auto.gguf.AutoGGUFModel
import org.apache.spark.ml.Pipeline

val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val autoGGUFModel = AutoGGUFModel.pretrained("qwen3_4b_bf16_gguf", "en")
  .setInputCols("document")
  .setOutputCol("completions")
  .setBatchSize(4)
  .setNPredict(-1) // no fixed token cap, matching the Python example
  .setNGpuLayers(99)
  .setTemperature(0.4f)
  .setTopK(40)
  .setTopP(0.9f)
  .setPenalizeNl(true)

val pipeline = new Pipeline().setStages(Array(
  documentAssembler,
  autoGGUFModel
))

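// toDF on a Seq needs the session's implicits (already in scope in spark-shell)
import spark.implicits._

val data = Seq("Give me a short introduction to large language model.").toDF("text")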

val model = pipeline.fit(data)
val result = model.transform(data)

result.select("completions").show(false)

Results

Large language models (LLMs) are advanced artificial intelligence systems designed to understand and generate human-like text. Trained on vast amounts of data, they can answer questions, write essays, code, create stories, and engage in conversations. These models use deep learning algorithms to recognize patterns in language, enabling them to produce coherent and contextually relevant responses. LLMs have revolutionized fields like customer service, content creation, and research, offering powerful tools for tasks ranging from translation to creative writing. While they are highly capable, their outputs depend on the quality of their training data and the specific instructions given.
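
The `completions` column holds Spark NLP annotation structs rather than bare strings, so the generated text sits in each annotation's `result` field. Below is a minimal sketch of pulling that text out after rows have been collected to the driver. The names `extract_completions` and `sample_rows` are hypothetical and introduced only for illustration, and the plain dictionaries merely stand in for the rows a real pipeline would return.

```python
from typing import Any, Iterable, List


def extract_completions(rows: Iterable[Any], column: str = "completions") -> List[str]:
    """Collect the `result` text of every annotation found in `column`.

    Works on any row type that supports lookup by field name,
    such as plain dicts (used here) or collected PySpark Row objects.
    """
    texts: List[str] = []
    for row in rows:
        for annotation in row[column]:
            texts.append(annotation["result"])
    return texts


# Hypothetical stand-in for collected rows; the text is a placeholder,
# not real model output.
sample_rows = [
    {
        "completions": [
            {
                "annotatorType": "document",
                "result": "example completion text",
                "metadata": {},
            }
        ]
    }
]

print(extract_completions(sample_rows))
```

With a live session, a call such as `extract_completions(result.select("completions").collect())` should behave the same way, since PySpark `Row` objects also support lookup by field name. Collecting is only sensible for small result sets; for larger ones the text is better kept in a DataFrame column.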

Model Information

Model Name:	qwen3_4b_bf16_gguf
Compatibility:	Spark NLP 6.0.3+
License:	Open Source
Edition:	Official
Input Labels:	[document]
Output Labels:	[completions]
Language:	en
Size:	6.4 GB
