Description
Pretrained CoHereTransformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.llama_3_2_11b_vision_instruct_int4
is a english model originally trained by Qwen.
How to use
image_assembler = ImageAssembler().setInputCol("image").setOutputCol("image_assembler")
seq2seq = MLLamaForMultimodal.pretrained("llama_3_2_11b_vision_instruct_int4","en") \
.setInputCols(["image_assembler"]) \
.setOutputCol("generation")
pipeline = Pipeline().setStages([image_assembler, seq2seq])
val image_assembler = new ImageAssembler().setInputCol("image").setOutputCol("image_assembler")
val seq2seq = MLLamaForMultimodal.pretrained("llama_3_2_11b_vision_instruct_int4","en")
.setInputCols(Array("image_assembler"))
.setOutputCol("embeddings")
val pipeline = new Pipeline().setStages(Array(image_assembler, seq2seq))
Model Information
Model Name: | llama_3_2_11b_vision_instruct_int4 |
Compatibility: | Spark NLP 5.5.1+ |
License: | Open Source |
Edition: | Official |
Input Labels: | [image_assembler] |
Output Labels: | [answer] |
Language: | en |
Size: | 6.4 GB |