The article presents a collection of foundation vision models developed by NVIDIA that integrate models such as CLIP, DINOv2, and SAM for image feature extraction. It lists several versions of these models along with their sizes and update statuses, which points to ongoing development.
nvidia
vision-models
image-extraction
machine-learning
deep-learning