7 links tagged with all of: reasoning + machine-learning
Links
R-Zero is a self-evolving framework for Large Language Models (LLMs) that generates its own training data autonomously, removing the reliance on human-curated tasks. It pairs two models: a Challenger, which poses increasingly difficult tasks, and a Solver, which attempts them. The two co-evolve, yielding significant improvements in reasoning capability across various benchmarks, with notable gains reported for the Qwen3-4B-Base model.
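A minimal toy sketch of the Challenger/Solver co-evolution loop described above; the function names, reward shaping, and update rules are illustrative assumptions, not R-Zero's actual training procedure.

```python
# Toy co-evolution loop: a Challenger that proposes tasks and a Solver that
# attempts them. Reward shaping and updates are illustrative assumptions,
# not R-Zero's actual training procedure.
import random

def challenger_propose(difficulty: float) -> dict:
    """Stand-in for the Challenger: emit an arithmetic task scaled by difficulty."""
    hi = int(10 * difficulty) + 1
    a, b = random.randint(1, hi), random.randint(1, hi)
    return {"question": f"{a} + {b} = ?", "answer": a + b}

def solver_attempt(task: dict, skill: float) -> bool:
    """Stand-in for the Solver: succeed with probability tied to its skill."""
    return random.random() < min(1.0, skill / (task["answer"] + 1))

difficulty, skill = 1.0, 5.0
for _ in range(1000):
    task = challenger_propose(difficulty)
    if solver_attempt(task, skill):
        skill += 0.05        # the Solver improves on tasks it can solve
        difficulty += 0.02   # the Challenger pushes toward harder tasks
    else:
        difficulty = max(1.0, difficulty - 0.01)  # back off past the Solver's frontier

print(f"final difficulty={difficulty:.2f}, solver skill={skill:.2f}")
```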
Daily-Omni is introduced as a new benchmark for audio-visual reasoning, featuring 684 videos and 1197 QA pairs across various tasks. The study highlights how much current multimodal large language models struggle to integrate audio and visual information, and shows that combining visual and audio models with temporal alignment techniques improves performance. The paper also presents a QA generation pipeline that makes evaluation more efficient and scalable.
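A small sketch of the kind of temporal alignment step mentioned above, pairing visual and audio events by timestamp before handing them to a QA model; the event format, window size, and merge rule are assumptions for illustration, not Daily-Omni's pipeline.

```python
# Illustrative temporal alignment of visual and audio events before QA.
# The event format, window size, and merge rule are assumptions, not
# Daily-Omni's actual pipeline.
visual_events = [(0.0, "a person enters the kitchen"), (4.5, "they open the fridge")]
audio_events = [(1.2, "door creaks"), (5.0, "speech: 'where is the milk?'")]

def align(visual, audio, window=2.0):
    """Pair each visual event with audio events within `window` seconds."""
    merged = []
    for t_v, v in visual:
        nearby = [a for t_a, a in audio if abs(t_a - t_v) <= window]
        merged.append({"time": t_v, "visual": v, "audio": nearby})
    return merged

for item in align(visual_events, audio_events):
    print(item)  # aligned context that a QA model could reason over
```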
Robix is a unified model that integrates robot reasoning, task planning, and natural language interaction, enhancing human-robot collaboration as part of a hierarchical robot system. It supports capabilities such as proactive dialogue and context-aware reasoning, and delivers strong interactive task execution across a range of user-involved scenarios. Extensive evaluations show that Robix outperforms leading models in both foundational and interactive capabilities.
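A toy sketch of a hierarchical loop in the spirit of the description above, where a high-level reasoning layer decides between executing a subtask, asking a clarifying question, or replying; all function names and decision rules are hypothetical, not Robix's implementation.

```python
# Hypothetical hierarchical loop: a high-level reasoning layer chooses between
# executing, asking, or replying, and hands subtasks to a low-level controller.
# Names and decision rules are illustrative, not Robix's implementation.
def low_level_controller(subtask: str) -> str:
    return f"executed: {subtask}"

def reasoning_layer(instruction: str) -> dict:
    """Decide the next move from the current user instruction."""
    if instruction.strip().endswith("?"):
        # Ambiguous or open question -> proactive dialogue instead of acting blindly.
        return {"action": "ask", "utterance": "Could you clarify which object you mean?"}
    if "stop" in instruction.lower():
        # Interruption handling: acknowledge and halt.
        return {"action": "reply", "utterance": "Okay, stopping the current task."}
    return {"action": "execute", "subtask": f"plan and perform: {instruction}"}

history = []
for user_turn in ["Bring me the red cup", "Can you grab the other one?", "stop"]:
    decision = reasoning_layer(user_turn)
    if decision["action"] == "execute":
        history.append(low_level_controller(decision["subtask"]))
    else:
        history.append(decision["utterance"])
print(history)
```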
Recent advances in large language models (LLMs) have prompted debate about their reasoning capabilities. This study introduces a representation engineering approach that derives control vectors from model activations and uses them to modulate the model at inference time, improving reasoning performance on various tasks without any additional training.
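A minimal numpy sketch of the generic control-vector recipe: contrast activations from two prompt sets, take the difference of means, and add the scaled vector back at inference. The placeholder extractor, prompt sets, layer choice, and scaling factor are assumptions, not necessarily the study's exact procedure.

```python
# Minimal numpy sketch of an activation-based control vector. The placeholder
# extractor, prompt sets, and scaling factor are assumptions, not necessarily
# the study's exact procedure.
import numpy as np

def get_hidden_states(prompts, dim=64, seed=0):
    """Placeholder for extracting a chosen layer's hidden states from an LLM."""
    rng = np.random.default_rng(seed)
    return rng.normal(size=(len(prompts), dim))

reasoning_prompts = ["Let's think step by step: ...", "First, break the problem down: ..."]
baseline_prompts = ["Answer immediately: ...", "Just state the result: ..."]

# 1. Collect activations for two contrasting prompt sets at the same layer.
h_reason = get_hidden_states(reasoning_prompts, seed=1)
h_base = get_hidden_states(baseline_prompts, seed=2)

# 2. The control vector is the difference of the mean activations.
control_vector = h_reason.mean(axis=0) - h_base.mean(axis=0)

# 3. At inference, add the scaled vector to that layer's activations.
def steer(hidden_state, alpha=4.0):
    return hidden_state + alpha * control_vector

print(steer(get_hidden_states(["2 + 2 = ?"], seed=3)[0])[:5])
```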
M1 introduces a hybrid linear RNN reasoning model based on the Mamba architecture, designed for scalable test-time computation on complex mathematical problems. Built by distilling from existing reasoning models and then applying reinforcement learning, M1 achieves substantially faster, more memory-efficient inference than comparable transformer models while matching the accuracy of state-of-the-art distilled reasoning models.
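A brief sketch of why faster inference matters for test-time scaling: a higher-throughput model can draw more samples within the same wall-clock budget and aggregate them, for example by majority voting. The random sampler below is a stand-in, not M1 itself.

```python
# Test-time scaling via self-consistency: more samples per wall-clock budget,
# then a majority vote. The random sampler is a stand-in, not M1 itself.
from collections import Counter
import random

def sample_answer(question: str) -> str:
    """Placeholder generator; a real setup would sample from the reasoning model."""
    return random.choice(["42", "42", "42", "41", "40"])

def majority_vote(question: str, n_samples: int) -> str:
    votes = Counter(sample_answer(question) for _ in range(n_samples))
    return votes.most_common(1)[0][0]

# A model with higher generation throughput can afford more samples within the
# same latency budget, which is where the extra accuracy comes from.
print(majority_vote("What is 6 * 7?", n_samples=32))
```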
JudgeLRM introduces a novel approach to using Large Language Models (LLMs) as evaluators, particularly for complex reasoning tasks. By employing reinforcement learning with judge-wise rewards, JudgeLRM models significantly outperform both supervised fine-tuning (SFT) baselines and current leading models, with the clearest advantage on evaluation tasks that require deep reasoning.
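A hedged sketch of what a judge-wise reward could look like, combining a structural check on the judge's output with agreement against a gold preference label; the tag formats, terms, and weights are assumptions, not JudgeLRM's actual reward.

```python
# Hedged sketch of a judge-wise reward: a structural check on the judge's
# output plus agreement with a gold preference. Tag formats and weights are
# assumptions, not JudgeLRM's actual reward.
import re

def judge_reward(judge_output: str, gold_choice: str) -> float:
    """Score one judgment: well-formed reasoning plus a correct final verdict."""
    reward = 0.0
    if "<think>" in judge_output and "</think>" in judge_output:
        reward += 0.2  # structural term: the judge exposes its reasoning
    verdict = re.search(r"\[\[([AB])\]\]", judge_output)
    if verdict:
        reward += 0.2  # structural term: a parseable verdict is present
        if verdict.group(1) == gold_choice:
            reward += 1.0  # outcome term: agreement with the ground-truth preference
    return reward

sample = "<think>Response A cites the theorem correctly...</think> Verdict: [[A]]"
print(judge_reward(sample, gold_choice="A"))  # 1.4
```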
XBai o4 is the latest fourth-generation open-source large model release, showcasing enhanced complex reasoning capabilities that surpass OpenAI-o3-mini in Medium mode. It employs a novel reflective generative training form to significantly reduce inference costs and improve response quality. The repository includes training and evaluation code, along with setup instructions and benchmark results.