2 links tagged with all of: multimodal + open-source + language-model
Links
The GLM-4.6V series introduces two open-source multimodal models covering both high-performance cloud use and local deployment. The models feature a 128K-token context window and native tool calling, so a single request can combine visual and textual inputs for tasks like content creation and web search.
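As a sketch of what the image-input-plus-tool-calling combination might look like in practice, here is a request through an OpenAI-compatible chat endpoint; the base URL, model identifier, and `web_search` tool schema are illustrative assumptions, not documented values, so check the provider's docs before use:

```python
# Minimal sketch: multimodal request with a tool definition against an
# OpenAI-compatible endpoint. Endpoint, model name, and tool schema are
# assumptions for illustration only.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.example.com/v1",  # assumed endpoint
    api_key="YOUR_API_KEY",
)

tools = [{
    "type": "function",
    "function": {
        "name": "web_search",  # hypothetical tool
        "description": "Search the web for a query.",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
}]

response = client.chat.completions.create(
    model="glm-4.6v",  # assumed model identifier
    messages=[{
        "role": "user",
        "content": [
            # Image and text parts travel in the same user message.
            {"type": "image_url",
             "image_url": {"url": "https://example.com/chart.png"}},
            {"type": "text",
             "text": "What does this chart show? Search for context if needed."},
        ],
    }],
    tools=tools,
)
print(response.choices[0].message)
```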
R-4B is a multimodal large language model built for general-purpose auto-thinking: it dynamically switches between thinking and non-thinking modes based on task complexity. A two-stage training approach improves response efficiency and reduces computational cost, and the model achieves state-of-the-art performance among models of comparable size. R-4B is open source, and users can explicitly control its thinking mode.
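To illustrate the user-controllable thinking mode, here is a minimal sketch using Hugging Face transformers; the repo id and the `enable_thinking` flag are assumptions (extra keyword arguments to `apply_chat_template` are forwarded to the chat template, but the actual switch name should be taken from the R-4B model card):

```python
# Minimal sketch: forcing non-thinking mode via a chat-template flag.
# The repo id and `enable_thinking` flag name are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "YannQi/R-4B"  # assumed Hugging Face repo id

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

messages = [{"role": "user", "content": "How many primes are below 30?"}]

# Render the prompt with thinking disabled; omitting the flag would leave
# the model in its default auto-thinking behavior (assumption).
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
    enable_thinking=False,  # assumed flag name
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the rendered prompt.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```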