llama_3_2_11b_vision_instruct_int4 model

Description

Pretrained CoHereTransformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.llama_3_2_11b_vision_instruct_int4 is a english model originally trained by Qwen.

Download Copy S3 URI

How to use


image_assembler = ImageAssembler().setInputCol("image").setOutputCol("image_assembler")


seq2seq = MLLamaForMultimodal.pretrained("llama_3_2_11b_vision_instruct_int4","en") \
      .setInputCols(["image_assembler"]) \
      .setOutputCol("generation")

pipeline = Pipeline().setStages([image_assembler, seq2seq])



val image_assembler = new ImageAssembler().setInputCol("image").setOutputCol("image_assembler")

val seq2seq = MLLamaForMultimodal.pretrained("llama_3_2_11b_vision_instruct_int4","en")
    .setInputCols(Array("image_assembler"))
    .setOutputCol("embeddings")

val pipeline = new Pipeline().setStages(Array(image_assembler, seq2seq))


Model Information

Model Name: llama_3_2_11b_vision_instruct_int4
Compatibility: Spark NLP 5.5.1+
License: Open Source
Edition: Official
Input Labels: [image_assembler]
Output Labels: [answer]
Language: en
Size: 6.4 GB