Quit Emailing Yourself

# multimodal → open-source → reasoning

2 links tagged with all of: multimodal + open-source + reasoning


Links

GitHub - MoonshotAI/Kimi-VL: Kimi-VL: Mixture-of-Experts Vision-Language Model for Multimodal Reasoning, Long-Context Understanding, and Strong Agent Capabilities

Kimi-VL is an open-source Mixture-of-Experts vision-language model built for multimodal reasoning and long-context understanding, with only 2.8B activated parameters. It performs strongly on tasks such as multi-turn interaction, video comprehension, and mathematical reasoning, competing with larger models while remaining efficient. The latest variant, Kimi-VL-A3B-Thinking-2506, improves reasoning and visual perception and achieves state-of-the-art results on several benchmarks.

Saved by tldr-importer · Last saved October 29, 2025 · 5 min read

Tags: vision-language, multimodal, reasoning, open-source, model

InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency

InternVL3.5 is a new family of open-source multimodal models that improves versatility, reasoning capability, and inference efficiency. A key innovation is the Cascade Reinforcement Learning framework, which significantly improves performance on reasoning tasks, while a Visual Resolution Router optimizes the resolution of visual tokens. The models achieve notable performance gains and support advanced capabilities such as GUI interaction and embodied agency, making them competitive with leading commercial models.

Saved by tldr-importer · Last saved October 29, 2025 · 2 min read

Tags: multimodal, reasoning, reinforcement-learning, open-source, computer-vision