7 links
tagged with all of: reasoning + language-models
Links
Parallel Scaling (ParScale) is a new scaling paradigm for language models that scales parallel computation during training and inference rather than parameters. The approach demonstrates significant benefits, including improved reasoning performance, greater inference efficiency, and lower memory and latency costs compared to traditional parameter scaling. The authors provide various models and tools to facilitate implementation and experimentation with this new scaling law.
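A minimal sketch of the idea, assuming a backbone that maps hidden states to hidden states; the per-stream transform (`stream_bias`), the gating layer, and the class name are illustrative, not taken from the paper:

```python
import torch
import torch.nn as nn

class ParallelScaledModel(nn.Module):
    """Sketch of parallel scaling: P input transforms share one backbone,
    and a learned gate aggregates the P output streams."""
    def __init__(self, backbone: nn.Module, hidden: int, n_streams: int = 4):
        super().__init__()
        self.backbone = backbone                       # shared, reused for every stream
        self.n_streams = n_streams
        self.stream_bias = nn.Parameter(torch.zeros(n_streams, hidden))  # diversifies inputs
        self.gate = nn.Linear(hidden, 1)               # scores each stream's output

    def forward(self, x):                              # x: (batch, seq, hidden)
        outs = []
        for i in range(self.n_streams):                # P forward passes, parallelizable
            outs.append(self.backbone(x + self.stream_bias[i]))
        outs = torch.stack(outs, dim=0)                # (P, batch, seq, hidden)
        weights = torch.softmax(self.gate(outs), dim=0)
        return (weights * outs).sum(dim=0)             # dynamic aggregation of the P streams
```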
Reinforcement Learned Teachers (RLT) train teacher models to generate clear explanations from question-answer pairs, enhancing student models' understanding. This innovative approach allows compact teacher models to outperform larger ones in reasoning tasks, significantly reducing training costs and times while maintaining effectiveness. The framework shifts the focus from problem-solving to teaching, promising advancements in AI reasoning models.
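A hedged sketch of the core reward signal, assuming a HuggingFace-style causal LM as the frozen student; the function name and prompt format are assumptions, and the paper's actual reward may combine additional terms:

```python
import torch

def explanation_reward(student, tokenizer, question, solution, explanation):
    """Sketch: score a teacher's explanation by how likely the frozen student
    finds the ground-truth solution when conditioned on that explanation."""
    prompt = f"Question: {question}\nExplanation: {explanation}\nSolution:"
    prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids
    target_ids = tokenizer(" " + solution, return_tensors="pt",
                           add_special_tokens=False).input_ids
    ids = torch.cat([prompt_ids, target_ids], dim=1)
    with torch.no_grad():
        logits = student(ids).logits
    # logits at position i predict token i+1, so align them with the solution tokens
    sol_logits = logits[0, prompt_ids.size(1) - 1 : -1]
    logp = torch.log_softmax(sol_logits, dim=-1)
    token_logps = logp.gather(1, target_ids[0].unsqueeze(1)).squeeze(1)
    return token_logps.mean().item()  # higher = the explanation made the solution easier to predict
```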
Recent advancements in large language models (LLMs) have prompted discussions about their reasoning capabilities. This study introduces a representation engineering approach that leverages model activations to create control vectors, enhancing reasoning performance on various tasks without additional training. The results indicate that modulating model activations can effectively improve LLMs' reasoning abilities.
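A minimal sketch of the general recipe, assuming a HuggingFace-style causal LM; the contrastive prompt sets, layer choice, and scaling factor `alpha` are illustrative assumptions, not the paper's exact procedure:

```python
import torch

def extract_control_vector(model, tokenizer, pos_prompts, neg_prompts, layer_idx):
    """Contrastive control vector: difference of mean last-token hidden states
    between prompts that exhibit the target behaviour and prompts that do not."""
    def mean_hidden(prompts):
        states = []
        with torch.no_grad():
            for p in prompts:
                ids = tokenizer(p, return_tensors="pt").input_ids
                out = model(ids, output_hidden_states=True)
                states.append(out.hidden_states[layer_idx][0, -1])
        return torch.stack(states).mean(dim=0)
    return mean_hidden(pos_prompts) - mean_hidden(neg_prompts)

def add_control_hook(layer, vector, alpha=1.0):
    """Steer generation by adding the scaled vector to the layer's output on
    every forward pass -- no additional training required."""
    def hook(_module, _inputs, output):
        hidden = output[0] if isinstance(output, tuple) else output
        steered = hidden + alpha * vector.to(hidden.dtype)
        return (steered,) + output[1:] if isinstance(output, tuple) else steered
    return layer.register_forward_hook(hook)
```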
ThinkMesh is a Python library designed for executing various reasoning strategies in parallel using language models, particularly leveraging the Qwen2.5-7B-Instruct model. It supports multiple reasoning approaches such as DeepConf, Self-Consistency, and Debate, catering to a range of problem types from mathematical proofs to planning tasks. The library also includes performance monitoring and benchmarking features to ensure effective usage and integration with different backends.
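The snippet below is not ThinkMesh's own API; it is a generic sketch of one of the strategies it supports, self-consistency, written with plain `transformers`: sample several reasoning paths from Qwen2.5-7B-Instruct and majority-vote the extracted answers:

```python
from collections import Counter
from transformers import pipeline

generator = pipeline("text-generation", model="Qwen/Qwen2.5-7B-Instruct")

def self_consistency(question: str, n_samples: int = 8) -> str:
    """Sample several reasoning paths and return the most common final answer."""
    prompt = f"{question}\nThink step by step, then give the final answer after 'Answer:'."
    outputs = generator(prompt, num_return_sequences=n_samples, do_sample=True,
                        temperature=0.8, max_new_tokens=512, return_full_text=False)
    answers = []
    for out in outputs:
        text = out["generated_text"]
        if "Answer:" in text:
            answers.append(text.rsplit("Answer:", 1)[-1].strip().split("\n")[0])
    return Counter(answers).most_common(1)[0][0] if answers else ""
```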
REverse-Engineered Reasoning (REER) introduces a novel approach to instilling deep reasoning in language models by working backwards from known solutions to discover the underlying reasoning process. This method addresses the limitations of traditional reinforcement learning and instruction distillation, resulting in the creation of a large dataset, DeepWriting-20K, and a model, DeepWriter-8B, that outperforms existing models in open-ended tasks. The research emphasizes the importance of structured reasoning and iterative refinement in generating high-quality outputs.
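A hedged sketch of the "work backwards from a known solution" idea as simple local search; `propose_edit` and `score` are hypothetical callables (e.g. the solution's perplexity under a frozen LM given the candidate trace), not the paper's interface:

```python
def reer_style_search(question, solution, init_trace, propose_edit, score, steps=50):
    """Iteratively refine a reasoning trace, keeping only edits that make the
    known solution easier to explain (lower score = better, e.g. perplexity)."""
    best_trace = init_trace
    best_score = score(question, best_trace, solution)
    for _ in range(steps):
        candidate = propose_edit(best_trace)           # e.g. rewrite one step of the trace
        candidate_score = score(question, candidate, solution)
        if candidate_score < best_score:
            best_trace, best_score = candidate, candidate_score
    return best_trace
```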
Reinforcement Learning on Pre-Training Data (RLPT) introduces a new paradigm for scaling large language models (LLMs) by allowing the policy to autonomously explore meaningful trajectories from pre-training data without relying on human annotations for rewards. By adopting a next-segment reasoning objective, RLPT improves LLM capabilities, as demonstrated by significant performance gains on various reasoning benchmarks and encouraging broader context exploration for enhanced generalization.
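A hedged sketch of the data side of a next-segment objective: slice a pre-training document into (context, next segment) pairs and reward the policy's predicted continuation by overlap with the true segment. The segment length, the F1-style reward, and the function names are assumptions, not the paper's exact recipe:

```python
def make_next_segment_examples(document: str, seg_len: int = 200):
    """Turn one pre-training document into (context, target next segment) pairs."""
    segments = [document[i:i + seg_len] for i in range(0, len(document), seg_len)]
    return [{"context": "".join(segments[:k]), "target": segments[k]}
            for k in range(1, len(segments))]

def next_segment_reward(predicted: str, target: str) -> float:
    """Self-supervised reward: token-overlap F1 between the policy's prediction
    and the document's actual next segment (no human annotation needed)."""
    p, t = set(predicted.lower().split()), set(target.lower().split())
    if not p or not t:
        return 0.0
    precision, recall = len(p & t) / len(p), len(p & t) / len(t)
    return 0.0 if precision + recall == 0 else 2 * precision * recall / (precision + recall)
```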
The paper introduces the Chain of Draft (CoD) paradigm, which enables Large Language Models (LLMs) to generate concise intermediate reasoning outputs, mimicking human draft strategies. By focusing on essential information and reducing verbosity, CoD achieves comparable or superior accuracy to Chain-of-Thought prompting while utilizing significantly fewer tokens, thus lowering costs and latency in reasoning tasks.
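An illustrative prompting sketch of the difference (the exact wording of the paper's CoD instruction may differ): the draft-style prompt caps each reasoning step at a few words instead of full sentences, so far fewer tokens are generated:

```python
# Chain-of-Thought: free-form, verbose step-by-step reasoning.
COT_INSTRUCTION = ("Think step by step to answer the question, "
                   "then give the final answer after '####'.")

# Chain of Draft: same structure, but each step is a terse draft, cutting output tokens.
COD_INSTRUCTION = ("Think step by step, but keep only a minimum draft for each step, "
                   "a few words at most, then give the final answer after '####'.")

question = ("Jason had 20 lollipops. He gave Denny some lollipops. "
            "Now Jason has 12 lollipops. How many did he give to Denny?")
# Typical CoT completion: several full sentences of arithmetic before '#### 8'.
# Typical CoD completion: '20 - x = 12; x = 8' followed by '#### 8'.
```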