Description
Pretrained ViT (Vision Transformer) image classification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. image_classifier_vit_base_patch16_224_in21k_ucSat is an English model originally trained by YKXBCi.
Predicted Entities
buildings, denseresidential, storagetanks, tenniscourt, parkinglot, golfcourse, intersection, harbor, river, runway, mediumresidential, chaparral, freeway, overpass, mobilehomepark, baseballdiamond, agricultural, airplane, sparseresidential, forest, beach
How to use
from pyspark.ml import Pipeline
from sparknlp.base import ImageAssembler
from sparknlp.annotator import ViTForImageClassification

# imageDF must be a Spark DataFrame with an "image" column
image_assembler = ImageAssembler() \
    .setInputCol("image") \
    .setOutputCol("image_assembler")

imageClassifier = ViTForImageClassification \
    .pretrained("image_classifier_vit_base_patch16_224_in21k_ucSat", "en") \
    .setInputCols("image_assembler") \
    .setOutputCol("class")

pipeline = Pipeline(stages=[
    image_assembler,
    imageClassifier,
])

pipelineModel = pipeline.fit(imageDF)
pipelineDF = pipelineModel.transform(imageDF)
import com.johnsnowlabs.nlp.ImageAssembler
import com.johnsnowlabs.nlp.annotators.cv.ViTForImageClassification
import org.apache.spark.ml.Pipeline

// imageDF must be a Spark DataFrame with an "image" column
val imageAssembler = new ImageAssembler()
  .setInputCol("image")
  .setOutputCol("image_assembler")

val imageClassifier = ViTForImageClassification
  .pretrained("image_classifier_vit_base_patch16_224_in21k_ucSat", "en")
  .setInputCols("image_assembler")
  .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(imageAssembler, imageClassifier))

val pipelineModel = pipeline.fit(imageDF)
val pipelineDF = pipelineModel.transform(imageDF)
import nlu
import requests

# Download a sample image to classify
response = requests.get('https://raw.githubusercontent.com/JohnSnowLabs/spark-nlp/master/docs/assets/images/hen.JPEG')
with open('hen.JPEG', 'wb') as f:
    f.write(response.content)

nlu.load("en.classify_image.base_patch16_224_in21k_ucSat").predict("hen.JPEG")
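The Python and Scala pipelines above use an `imageDF` DataFrame without showing where it comes from or how to read the predictions. The following is a minimal sketch of one way to do both in Python, assuming the images sit in a local folder and are read with Spark's built-in `image` data source. The helper names `load_images` and `show_predictions` are hypothetical and not part of Spark NLP. The `class` column name matches the `setOutputCol("class")` call in the pipeline.

```python
def load_images(spark, path):
    """Read a folder of images into a DataFrame with an `image` column.

    `spark` is an active SparkSession; `path` is a folder of image files.
    The `dropInvalid` option skips files that cannot be decoded as images.
    """
    return (
        spark.read.format("image")
        .option("dropInvalid", True)
        .load(path)
    )


def show_predictions(pipeline_df):
    """Print one row per image: the file name and the predicted label.

    `pipeline_df` is the output of `pipelineModel.transform(imageDF)`,
    so it carries the original `image` column plus the `class` column.
    """
    pipeline_df.selectExpr(
        "reverse(split(image.origin, '/'))[0] AS image_name",
        "class.result AS prediction",
    ).show(truncate=False)
```

With a session started through `sparknlp.start()`, `imageDF = load_images(spark, "path/to/images")` (a placeholder path) would feed the pipeline, and `show_predictions(pipelineDF)` would list each file next to one of the labels under Predicted Entities.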
Model Information
| Model Name: | image_classifier_vit_base_patch16_224_in21k_ucSat |
| Compatibility: | Spark NLP 4.1.0+ |
| License: | Open Source |
| Edition: | Official |
| Input Labels: | [image_assembler] |
| Output Labels: | [class] |
| Language: | en |
| Size: | 322.0 MB |