Description
LLaVA is an open-source chatbot trained by fine-tuning LLaMA/Vicuna on GPT-generated multimodal instruction-following data. It is an auto-regressive language model, based on the transformer architecture.
Predicted Entities
How to use
import sparknlp
from sparknlp.base import *
from sparknlp.annotator import *
from pyspark.ml import Pipeline
from pyspark.sql.functions import lit
image_df = spark.read.format("image").load(path=images_path) # Replace with your image path
test_df = image_df.withColumn("text", lit("USER: \n <|image|> \n What's this picture about? \n ASSISTANT:\n"))
imageAssembler = ImageAssembler()
.setInputCol("image")
.setOutputCol("image_assembler")
visualQAClassifier = LLAVAForMultiModal.pretrained()
.setInputCols("image_assembler")
.setOutputCol("answer")
pipeline = Pipeline().setStages([
imageAssembler,
visualQAClassifier
])
result = pipeline.fit(test_df).transform(test_df)
result.select("image_assembler.origin", "answer.result").show(False)
import spark.implicits._
import com.johnsnowlabs.nlp.base._
import com.johnsnowlabs.nlp.annotator._
import org.apache.spark.ml.Pipeline
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.lit
val imageFolder = "path/to/your/images" // Replace with your image path
val imageDF: DataFrame = spark.read
.format("image")
.option("dropInvalid", value = true)
.load(imageFolder)
val testDF: DataFrame = imageDF.withColumn("text", lit("USER: \n <|image|> \nWhat is unusual on this picture? \n ASSISTANT:\n"))
val imageAssembler: ImageAssembler = new ImageAssembler()
.setInputCol("image")
.setOutputCol("image_assembler")
val visualQAClassifier = LLAVAForMultiModal.pretrained()
.setInputCols("image_assembler")
.setOutputCol("answer")
val pipeline = new Pipeline().setStages(Array(
imageAssembler,
visualQAClassifier
))
val result = pipeline.fit(testDF).transform(testDF)
result.select("image_assembler.origin", "answer.result").show(false)
Model Information
Model Name: | llava_1_5_7b_hf |
Compatibility: | Spark NLP 5.5.1+ |
License: | Open Source |
Edition: | Official |
Input Labels: | [image_assembler] |
Output Labels: | [answer] |
Language: | en |
Size: | 3.9 GB |
References
https://huggingface.co/llava-hf/llava-1.5-7b-hf
PREVIOUSOLMo 1B