Description
Phi-4-mini-instruct is a lightweight open model built upon synthetic data and filtered publicly available websites, with a focus on high-quality, reasoning-dense data. The model belongs to the Phi-4 model family and supports a 128K-token context length. It was further enhanced with both supervised fine-tuning and direct preference optimization to support precise instruction adherence and robust safety measures.
Original model from https://huggingface.co/microsoft/Phi-4-mini-instruct
How to use
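# Python
import sparknlp
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import AutoGGUFModel
from pyspark.ml import Pipeline

# Start a Spark session with Spark NLP loaded (skip this if `spark` already exists)
spark = sparknlp.start()

# Wrap the raw "text" column in Spark NLP's document annotation
document_assembler = DocumentAssembler() \
    .setInputCol("text") \
    .setOutputCol("document")

# Load the pretrained GGUF model and set the generation parameters;
# setNPredict(20) caps each completion at 20 generated tokens
auto_gguf_model = AutoGGUFModel.pretrained("phi_4_mini_instruct_q8_0_gguf", "en") \
    .setInputCols(["document"]) \
    .setOutputCol("completions") \
    .setBatchSize(4) \
    .setNPredict(20) \
    .setNGpuLayers(99) \
    .setTemperature(0.4) \
    .setTopK(40) \
    .setTopP(0.9) \
    .setPenalizeNl(True)

pipeline = Pipeline().setStages([
    document_assembler,
    auto_gguf_model
])

# One-row DataFrame holding the prompt to complete
data = spark.createDataFrame([
    ["The moon is "]
]).toDF("text")

model = pipeline.fit(data)
result = model.transform(data)
result.select("completions").show(truncate=False)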
// Scala
import com.johnsnowlabs.nlp.base.DocumentAssembler
import com.johnsnowlabs.nlp.annotators.auto.gguf.AutoGGUFModel
import org.apache.spark.ml.Pipeline
// Needed for Seq(...).toDF; assumes an active SparkSession named `spark`
import spark.implicits._

// Wrap the raw "text" column in Spark NLP's document annotation
val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

// Load the pretrained GGUF model and set the generation parameters;
// setNPredict(20) caps each completion at 20 generated tokens
val autoGGUFModel = AutoGGUFModel.pretrained("phi_4_mini_instruct_q8_0_gguf", "en")
  .setInputCols("document")
  .setOutputCol("completions")
  .setBatchSize(4)
  .setNPredict(20)
  .setNGpuLayers(99)
  .setTemperature(0.4f)
  .setTopK(40)
  .setTopP(0.9f)
  .setPenalizeNl(true)

val pipeline = new Pipeline().setStages(Array(
  documentAssembler,
  autoGGUFModel
))

// One-row DataFrame holding the prompt to complete
val data = Seq("The moon is ").toDF("text")

val model = pipeline.fit(data)
val result = model.transform(data)
result.select("completions").show(false)
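The examples above set temperature 0.4, top-k 40, and top-p 0.9. As a rough illustration of what those three settings do to a next-token distribution, here is a minimal pure-Python sketch. It is not Spark NLP or llama.cpp code: the function name `filter_distribution`, the toy logits, and the order of operations (temperature, then top-k, then top-p) are assumptions made for illustration, and the real sampler chain in the backend may order these steps differently and apply additional samplers.

```python
import math


def filter_distribution(logits, temperature=0.4, top_k=40, top_p=0.9):
    """Hypothetical helper: return {token: probability} after temperature
    scaling, top-k filtering, and top-p (nucleus) filtering, in that order."""
    # Temperature: divide logits before the softmax; values below 1 sharpen
    # the distribution, values above 1 flatten it.
    scaled = {tok: logit / temperature for tok, logit in logits.items()}
    peak = max(scaled.values())  # subtract the max for numerical stability
    weights = {tok: math.exp(v - peak) for tok, v in scaled.items()}
    total = sum(weights.values())
    probs = {tok: w / total for tok, w in weights.items()}

    # Top-k: keep only the k most probable tokens.
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:top_k]

    # Top-p: keep the shortest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for tok, p in ranked:
        kept.append((tok, p))
        cumulative += p
        if cumulative >= top_p:
            break

    # Renormalize the surviving tokens so they sum to 1 again.
    norm = sum(p for _, p in kept)
    return {tok: p / norm for tok, p in kept}


# Toy logits for four made-up tokens (illustrative values only).
example_logits = {"a": 2.0, "b": 1.0, "c": 0.0, "d": -1.0}

print(list(filter_distribution(example_logits, temperature=0.4)))  # ['a']
print(list(filter_distribution(example_logits, temperature=1.0)))  # ['a', 'b', 'c']
```

With these toy values, the lower temperature concentrates so much probability on the top token that the top-p cutoff keeps it alone, while temperature 1.0 leaves three candidates in play. This is why a low temperature combined with top-p tends to give more deterministic completions.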
Results
The moon is Earth's only natural satellite and the fifth largest moon in the Solar System. It orbits,
Model Information
| Model Name: | phi_4_mini_instruct_q8_0_gguf |
| Compatibility: | Spark NLP 6.0.0+ |
| License: | Open Source |
| Edition: | Official |
| Input Labels: | [document] |
| Output Labels: | [completions] |
| Language: | en |
| Size: | 3.9 GB |