Links
MolmoAct is an Action Reasoning Model (ARM) developed to improve spatial reasoning in robotics, so that machines can understand and execute tasks in three-dimensional space. Built on the open-source Molmo framework, MolmoAct uses depth-aware perception tokens for action planning and execution, and is reported to show strong performance and generalization in real-world scenarios. The model is fully open-source, which supports transparency and further research and development in the field.
SmolVLA is a compact, open-source Vision-Language-Action (VLA) model for robotics that runs on consumer hardware and is trained on community-shared datasets. It is reported to outperform larger models in both simulation and real-world tasks, and it responds faster by using asynchronous inference. Its lightweight architecture and efficient training methods aim to make advanced robotics capabilities more widely accessible.