Quit Emailing Yourself

# multimodal → reinforcement-learning → spatio-temporal

1 link tagged with all of: multimodal + reinforcement-learning + spatio-temporal

GitHub - OpenGVLab/VideoChat-R1: [NIPS2025] VideoChat-R1 & R1.5: Enhancing Spatio-Temporal Perception and Reasoning via Reinforcement Fine-Tuning

VideoChat-R1.5 has been released on Hugging Face. The model improves spatio-temporal perception and reasoning through multi-task joint reinforcement learning, building on the earlier VideoChat-R1 to strengthen video reasoning across applications. The work has been accepted at NeurIPS 2025. During inference, the model uses hierarchical human attention to better localize regions of interest in videos.

Saved by tldr-importer · Last saved October 29, 2025 · 1 min read

Tags: video-chat, reinforcement-learning, spatio-temporal, multimodal, nips2025
