Phi-4-mini-Instruct GGUF (bfloat16 Quantized) by Microsoft

Description

Phi-4-mini-instruct is a lightweight open model built on synthetic data and filtered publicly available websites, with a focus on high-quality, reasoning-dense data. The model belongs to the Phi-4 model family and supports a 128K-token context length. It was further enhanced with supervised fine-tuning and direct preference optimization to support precise instruction adherence and robust safety measures.

Original model from https://huggingface.co/microsoft/Phi-4-mini-instruct


How to use

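# Python
import sparknlp
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import AutoGGUFModel
from pyspark.ml import Pipeline

# Start a Spark session with Spark NLP (skip this if `spark` already exists)
spark = sparknlp.start()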

document_assembler = DocumentAssembler()\
    .setInputCol("text")\
    .setOutputCol("document")

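# Load the pretrained GGUF model and configure generation:
# nPredict=-1 sets no fixed cap on generated tokens; nGpuLayers=99 offloads up to 99 layers to the GPU
auto_gguf_model = AutoGGUFModel.pretrained("phi_4_mini_instruct_bf16_gguf", "en") \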
    .setInputCols(["document"]) \
    .setOutputCol("completions") \
    .setBatchSize(4) \
    .setNPredict(-1) \
    .setNGpuLayers(99) \
    .setTemperature(0.5) \
    .setTopK(50) \
    .setTopP(0.9) \
    .setPenalizeNl(False)

pipeline = Pipeline().setStages([
    document_assembler,
    auto_gguf_model
])

data = spark.createDataFrame([
    ["The moon is "]
]).toDF("text")

model = pipeline.fit(data)
result = model.transform(data)

result.select("completions").show(truncate=False)

// Scala
import com.johnsnowlabs.nlp.base.DocumentAssembler
import com.johnsnowlabs.nlp.annotators.auto.gguf.AutoGGUFModel
import org.apache.spark.ml.Pipeline

val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val autoGGUFModel = AutoGGUFModel.pretrained("phi_4_mini_instruct_bf16_gguf", "en")
  .setInputCols("document")
  .setOutputCol("completions")
  .setBatchSize(4)
  .setNPredict(-1)
  .setNGpuLayers(99)
  .setTemperature(0.5)
  .setTopK(50)
  .setTopP(0.9)
  .setPenalizeNl(false)

val pipeline = new Pipeline().setStages(Array(
  documentAssembler,
  autoGGUFModel
))

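// toDF on a Seq needs the session's implicits (already in scope in spark-shell)
import spark.implicits._
val data = Seq("The moon is ").toDF("text")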

val model = pipeline.fit(data)
val result = model.transform(data)

result.select("completions").show(false)
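
Both pipelines set temperature to 0.5, top-k to 50, and top-p to 0.9. As a rough illustration of what these three settings do to a next-token distribution, here is a minimal pure-Python sketch. The function name `filter_distribution` and the toy logits are hypothetical, and the order of the steps is an assumption made for readability; the inference backend may apply them differently.

```python
import math

def filter_distribution(logits, temperature=0.5, top_k=50, top_p=0.9):
    """Return {token: probability} after temperature, top-k and top-p filtering.

    Illustrative only: not the backend's actual sampler.
    """
    # Temperature scaling: values below 1.0 sharpen the distribution.
    scaled = {tok: logit / temperature for tok, logit in logits.items()}

    # Softmax, subtracting the maximum for numerical stability.
    peak = max(scaled.values())
    weights = {tok: math.exp(value - peak) for tok, value in scaled.items()}
    total = sum(weights.values())
    probs = {tok: weight / total for tok, weight in weights.items()}

    # Top-k: keep only the k most probable tokens.
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:top_k]

    # Top-p: keep the shortest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for tok, prob in ranked:
        kept.append((tok, prob))
        cumulative += prob
        if cumulative >= top_p:
            break

    # Renormalize so the surviving tokens sum to 1.
    norm = sum(prob for _, prob in kept)
    return {tok: prob / norm for tok, prob in kept}

# Toy logits for four hypothetical tokens.
toy_logits = {"a": 2.0, "b": 1.0, "c": 0.0, "d": -1.0}
filtered = filter_distribution(toy_logits)
print(sorted(filtered))
```

A token is then sampled from the filtered distribution. Lower temperature and smaller top-k or top-p values make the output more deterministic; larger values allow more varied completions.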

Results

The main causes of climate change are attributed to human activities, particularly the emission of greenhouse gases (GHGs) such as carbon dioxide (CO2), methane (CH4), and nitrous oxide (N2O). These emissions result primarily from the burning of fossil fuels for electricity, heat, and transportation, deforestation, industrial processes, and some agricultural practices. The accumulation of these gases in the atmosphere leads to the greenhouse effect, where the Earth's surface is heated by the sun and then radiates heat back towards space. Greenhouse gases trap this heat, causing the planet's average temperature to rise, a phenomenon known as global warming. This warming leads to climate change, which manifests in various ways, including more frequent and severe weather events, rising sea levels, and disruptions to ecosystems and biodiversity. Reducing GHG emissions through renewable energy sources, energy efficiency, reforestation, and sustainable land use practices are crucial steps to mitigate the impacts of climate change.
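
The generated text is stored in the `result` field of each annotation in the `completions` column. The following minimal sketch shows one way to pull out just that text once a row has been collected. The helper name `completion_texts` and the sample record are hypothetical placeholders, not real model output; the field names follow the Spark NLP annotation schema.

```python
def completion_texts(annotations):
    """Collect the `result` text from a list of annotation records.

    Works on plain dicts and on pyspark Row objects, since both
    support lookup by field name.
    """
    return [annotation["result"] for annotation in annotations]

# Placeholder standing in for one row's `completions` value.
# With a live session this could come from, for example:
#   result.select("completions").first()["completions"]
sample = [
    {"annotatorType": "document", "begin": 0, "end": 15, "result": "placeholder text"}
]

print(completion_texts(sample))
```

To stay inside Spark instead of collecting rows, `result.selectExpr("explode(completions.result) AS completion")` should give one generated string per row.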

Model Information

Model Name: phi_4_mini_instruct_bf16_gguf
Compatibility: Spark NLP 6.0.0+
License: Open Source
Edition: Official
Input Labels: [document]
Output Labels: [completions]
Language: en
Size: 6.1 GB