UniVLA is an approach to generalist policy planning in an embodiment-agnostic action space. It reports state-of-the-art results across several benchmarks with efficient training. The work describes how to extract latent actions from cross-embodiment videos and gives guidance on pre-training and fine-tuning models for real-world robot tasks.
robotics
machine-learning
action-planning
pre-training
video-analysis