Links
This GitHub repository provides RBench, a benchmark for evaluating robotics video generation, and RoVid-X, a dataset for training models with RGB, depth, and optical flow videos. The authors highlight limitations in existing video models and aim to enhance embodied AI research.
Runway has introduced GWM-1, its first world model, expanding beyond video generation. This family of autoregressive models lets users create and explore digital environments in real time, with applications in game design, virtual reality, and training AI agents. A companion model, GWM Robotics, generates synthetic data for robotics training.
This article traces the progression of video generation techniques toward comprehensive world models that simulate real-world dynamics. It proposes a four-generation taxonomy, showing how each generation adds capabilities such as realism, interaction, planning, and stochasticity. The authors argue that integrating physical and mental world models is key to applications in robotics and AI.
Runway has introduced its first world model, GWM-1, which predicts frame-by-frame simulations to capture physics and dynamics, aiming to improve video generation and training for robotics and other applications. Alongside it, Runway updated its Gen 4.5 video model with native audio and multi-shot capabilities.