Liquid is an auto-regressive multimodal model that unifies visual comprehension and generation by tokenizing images into discrete codes and training on them alongside text tokens. Because everything lives in a single shared feature space, the model understands and generates images without relying on external visual embeddings. Liquid is released in multiple sizes, and its scaling experiments reveal mutual benefits between understanding and generation tasks in multimodal models.
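
The core idea, a shared vocabulary of text tokens and discrete image codes fed to one causal transformer, can be sketched as below. This is an illustrative PyTorch toy, not Liquid's actual code: the class names, codebook size, and all hyperparameters are assumptions.

```python
import torch
import torch.nn as nn

class ToyVQTokenizer(nn.Module):
    """Quantizes image patch features into discrete codebook indices."""
    def __init__(self, codebook_size: int = 8192, dim: int = 256):
        super().__init__()
        self.codebook = nn.Embedding(codebook_size, dim)

    def forward(self, patch_feats: torch.Tensor) -> torch.Tensor:
        # patch_feats: (batch, num_patches, dim); nearest-neighbor lookup
        dists = torch.cdist(patch_feats, self.codebook.weight[None])
        return dists.argmin(dim=-1)  # (batch, num_patches) integer codes

class ToyUnifiedLM(nn.Module):
    """A single causal transformer over a joint text + image-code vocabulary."""
    def __init__(self, text_vocab: int = 32000, image_vocab: int = 8192, dim: int = 256):
        super().__init__()
        self.text_vocab = text_vocab
        self.embed = nn.Embedding(text_vocab + image_vocab, dim)
        layer = nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(dim, text_vocab + image_vocab)

    def forward(self, text_ids: torch.Tensor, image_codes: torch.Tensor) -> torch.Tensor:
        # Offset image codes into their own slice of the shared vocabulary,
        # then model the concatenated sequence with a causal mask.
        tokens = torch.cat([text_ids, image_codes + self.text_vocab], dim=1)
        mask = nn.Transformer.generate_square_subsequent_mask(tokens.size(1))
        hidden = self.backbone(self.embed(tokens), mask=mask)
        return self.head(hidden)  # next-token logits over the joint vocabulary

vq, lm = ToyVQTokenizer(), ToyUnifiedLM()
codes = vq(torch.randn(1, 64, 256))               # 64 discrete image codes
logits = lm(torch.randint(0, 32000, (1, 16)), codes)  # (1, 80, 40192)
```

Since image codes are just more tokens in the same sequence, the same next-token objective covers both understanding (image codes in the prompt) and generation (image codes in the output).
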
R-4B is a multimodal large language model built for general-purpose auto-thinking: it dynamically switches between a thinking mode and a non-thinking mode based on task complexity. A two-stage training approach improves response efficiency and reduces computational cost, and the model achieves state-of-the-art performance among models of comparable scale. R-4B is open-source and lets users explicitly control its thinking mode.
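
The routing idea can be sketched as below. This is a toy illustration only: R-4B learns its mode policy during training, whereas here the control tokens, the `estimate_complexity` heuristic, and the threshold are all assumptions made for the example.

```python
THINK_TOKEN = "<think>"        # hypothetical control token: emit a reasoning trace
NO_THINK_TOKEN = "<no_think>"  # hypothetical control token: answer directly

def estimate_complexity(prompt: str) -> float:
    """Toy heuristic standing in for the model's learned mode policy."""
    cues = ("prove", "derive", "step by step", "why", "calculate")
    return min(1.0, 0.2 * sum(cue in prompt.lower() for cue in cues))

def build_input(prompt: str, mode: str = "auto", threshold: float = 0.3) -> str:
    """Prefix the prompt with a mode token; 'auto' routes by estimated complexity."""
    if mode == "auto":
        mode = "think" if estimate_complexity(prompt) >= threshold else "no_think"
    token = THINK_TOKEN if mode == "think" else NO_THINK_TOKEN
    return f"{token}\n{prompt}"

print(build_input("What is the capital of France?"))     # routed to <no_think>
print(build_input("Prove that sqrt(2) is irrational."))  # routed to <think>
print(build_input("Summarize this paragraph.", mode="think"))  # user override
```

The `mode` argument mirrors the user-control aspect: "auto" defers to the routing policy, while an explicit "think" or "no_think" forces a mode, trading latency against reasoning depth.
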