StreamBridge is a framework that converts offline Video Large Language Models (Video-LLMs) into proactive streaming assistants. It targets two gaps in offline models: multi-turn understanding of a live stream and the lack of a proactive response mechanism. To close them, it combines a memory buffer with a lightweight activation model so the assistant can stay engaged continuously, and it introduces the Stream-IT dataset for streaming video comprehension. In the reported experiments, StreamBridge outperforms existing models on video understanding tasks.
video-llms
proactive-assistant
streaming
machine-learning
dataset
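
The summary above names two components, a memory buffer and a lightweight activation model, without giving their details. The following is a minimal sketch of how such a loop could be wired together, not StreamBridge's actual implementation. Every name here (`MemoryBuffer`, `run_stream`, `toy_activation`, `toy_responder`, the `threshold` parameter) is hypothetical, frames are stood in for by plain strings, and the oldest-first eviction is a simple placeholder rather than the framework's own buffer strategy.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Deque, List, Tuple

Frame = str  # assumption: a string stands in for a video frame or its embedding


@dataclass
class MemoryBuffer:
    """Hypothetical bounded store of recent frames plus past dialogue turns."""
    max_frames: int = 8
    frames: Deque[Frame] = field(default_factory=deque)
    turns: List[Tuple[str, str]] = field(default_factory=list)

    def add_frame(self, frame: Frame) -> None:
        self.frames.append(frame)
        # Placeholder policy: drop the oldest frame once the buffer is full.
        while len(self.frames) > self.max_frames:
            self.frames.popleft()


def run_stream(
    frames: List[Frame],
    query: str,
    activation: Callable[[List[Frame], str], float],
    responder: Callable[[List[Frame], List[Tuple[str, str]], str], str],
    buffer: MemoryBuffer,
    threshold: float = 0.5,
) -> List[Tuple[int, str]]:
    """Feed frames one by one; respond only when the activation score is high."""
    responses = []
    for t, frame in enumerate(frames):
        buffer.add_frame(frame)
        # The cheap activation model decides whether now is a moment to speak.
        if activation(list(buffer.frames), query) >= threshold:
            # Only then is the (expensive) Video-LLM stand-in called.
            answer = responder(list(buffer.frames), buffer.turns, query)
            buffer.turns.append((query, answer))  # keep multi-turn history
            responses.append((t, answer))
    return responses


def toy_activation(frames: List[Frame], query: str) -> float:
    # Toy stand-in: fire whenever the newest frame shows a goal.
    return 1.0 if "goal" in frames[-1] else 0.0


def toy_responder(frames: List[Frame], turns: List[Tuple[str, str]], query: str) -> str:
    return f"Saw '{frames[-1]}' after {len(turns)} earlier turn(s)."


buf = MemoryBuffer(max_frames=3)
out = run_stream(
    ["kickoff", "pass", "goal", "replay", "goal"],
    "Tell me when a goal is scored.",
    toy_activation,
    toy_responder,
    buf,
)
print(out)
```

Keeping the activation check separate from the responder means the larger model is only invoked at the moments the small model flags, which is one plausible reading of why the summary calls the activation model lightweight.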