The article provides an overview of a codebase for training language and vision-language models with PyTorch, covering installation instructions, model inference, and training setup. It details the required dependencies and configuration paths, explains how to integrate new datasets and models, and discusses how to use single- and multi-GPU resources for efficient training and evaluation.
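To make the inference workflow concrete, here is a minimal PyTorch sketch of the typical vision-language model structure the article describes: a vision encoder produces image features that are fused with text features before a language-model head. All names, layer sizes, and the fusion scheme here are illustrative assumptions, not the actual API of the codebase.

```python
import torch
import torch.nn as nn

# Hypothetical toy model: a vision encoder conditions a language head.
# Sizes and fusion are illustrative, not taken from the real codebase.
class ToyVLM(nn.Module):
    def __init__(self, vocab_size=100, dim=32):
        super().__init__()
        self.vision_encoder = nn.Sequential(
            nn.Flatten(),                  # flatten a 3x16x16 image
            nn.Linear(3 * 16 * 16, dim),
            nn.ReLU(),
        )
        self.text_embed = nn.Embedding(vocab_size, dim)
        self.lm_head = nn.Linear(dim, vocab_size)

    def forward(self, image, tokens):
        img_feat = self.vision_encoder(image)           # (B, dim)
        txt_feat = self.text_embed(tokens).mean(dim=1)  # (B, dim)
        fused = img_feat + txt_feat                     # simple additive fusion
        return self.lm_head(fused)                      # next-token logits

model = ToyVLM()
model.eval()
with torch.no_grad():  # inference mode: no gradients needed
    logits = model(torch.randn(2, 3, 16, 16), torch.randint(0, 100, (2, 5)))
print(logits.shape)  # torch.Size([2, 100])
```

A real setup would load pretrained weights and a tokenizer from the codebase's configuration paths rather than building the model inline.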
SmolVLA is a compact, open-source Vision-Language-Action model for robotics that runs on consumer hardware and is trained on community-shared datasets. It significantly outperforms larger models in both simulation and real-world tasks, and achieves faster response times through asynchronous inference. Its lightweight architecture and efficient training recipe aim to democratize access to advanced robotics capabilities.
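The asynchronous-inference idea can be sketched as a producer/consumer pattern: a background thread predicts the next chunk of actions while the robot executes the current chunk, hiding model latency. This is a conceptual illustration only; `fake_policy`, the chunk size, and the queue layout are assumptions standing in for SmolVLA's actual implementation.

```python
import threading
import queue
import time

# Stand-in for a real policy call: returns a "chunk" of actions
# for one observation, with simulated model latency.
def fake_policy(observation):
    time.sleep(0.01)
    return [observation + i for i in range(3)]

action_queue = queue.Queue(maxsize=1)

def inference_worker(observations):
    # Producer: predicts the next action chunk in the background.
    for obs in observations:
        action_queue.put(fake_policy(obs))  # blocks until consumer is ready

worker = threading.Thread(target=inference_worker, args=(range(3),), daemon=True)
worker.start()

executed = []
for _ in range(3):
    chunk = action_queue.get()  # next chunk, computed while we were "acting"
    executed.extend(chunk)      # stand-in for sending actions to the robot

worker.join()
print(executed)  # [0, 1, 2, 1, 2, 3, 2, 3, 4]
```

Decoupling prediction from execution this way means the robot never idles waiting on the model, which is the source of the faster response times described above.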