# machine-learning → large-language-models

5 links tagged with all of: machine-learning + large-language-models

Links

Deep Think with Confidence

Deep Think with Confidence (DeepConf) is a method for improving the reasoning efficiency and accuracy of large language models. It uses the model's internal confidence signals to filter out low-quality reasoning traces, requires no additional training or tuning, and can be integrated into existing systems. Evaluations on a range of reasoning tasks show significant accuracy gains alongside a reduction in generated tokens.
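The filtering idea can be illustrated with a small sketch. It is not taken from the DeepConf paper or its code: the summary does not say how confidence is computed, so this assumes each trace comes with per-token confidence values, scores a trace by their mean, keeps the most confident fraction, and takes a confidence-weighted vote. The names `trace_confidence`, `filtered_vote`, and `keep_fraction` are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def trace_confidence(token_confidences):
    """Score a trace by its mean per-token confidence (an assumed, simplified score)."""
    return mean(token_confidences)


def filtered_vote(traces, keep_fraction=0.5):
    """Pick an answer from (answer, token_confidences) pairs.

    Keeps only the most confident fraction of traces, then returns the
    answer with the highest total confidence among the kept traces.
    """
    if not traces:
        raise ValueError("need at least one trace")
    scored = [(answer, trace_confidence(conf)) for answer, conf in traces]
    # Most confident traces first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    keep = max(1, int(len(scored) * keep_fraction))
    votes = defaultdict(float)
    for answer, score in scored[:keep]:
        votes[answer] += score
    return max(votes, key=votes.get)


# Made-up traces: three low-confidence traces agree on "17",
# two high-confidence traces agree on "42".
traces = [
    ("42", [0.9, 0.8, 0.95]),
    ("42", [0.7, 0.75, 0.8]),
    ("17", [0.3, 0.4, 0.2]),
    ("17", [0.35, 0.3, 0.25]),
    ("17", [0.2, 0.3, 0.25]),
]
print(filtered_vote(traces, keep_fraction=0.5))
```

In this toy data an unweighted majority vote would pick "17" (three traces to two), while filtering to the two most confident traces picks "42". A real system might also stop low-confidence traces early to save tokens, which this sketch does not attempt.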

Saved by markshervey · Last saved January 12, 2026 · 1 min read

Tags: machine-learning, large-language-models, efficiency, reasoning, deep-learning

GitHub - xhyumiracle/Awesome-AgenticLLM-RL-Papers

The repository accompanies the survey paper "The Landscape of Agentic Reinforcement Learning for LLMs: A Survey" and catalogs reinforcement learning methods and their applications to large language models (LLMs). It includes tables summarizing methodologies, objectives, and key mechanisms, along with links to relevant papers and resources.

Saved by tldr-importer · Last saved October 29, 2025 · 7 min read

Tags: reinforcement-learning, large-language-models, agentic-llm, research-survey, machine-learning

JudgeLRM: Large Reasoning Models as a Judge

JudgeLRM introduces an approach to using large language models (LLMs) as evaluators, particularly for complex reasoning tasks. Trained with reinforcement learning using judge-wise rewards, JudgeLRM models significantly outperform models trained with traditional supervised fine-tuning (SFT), as well as current leading models, on judging tasks that require deep reasoning.

Saved by tldr-importer · Last saved October 29, 2025 · 1 min read

Tags: large-language-models, reasoning, reinforcement-learning, evaluation, machine-learning

[no-title]

The article describes an automated workflow for validating tabular data with large language models (LLMs). It outlines how LLMs can improve the accuracy and efficiency of data validation, and it covers the challenges involved along with possible implementation strategies.

Saved by tldr-importer · Last saved October 29, 2025 · 1 min read

Tags: data-validation, automated-workflow, machine-learning, large-language-models, tabular-data

GitHub - huawei-csl/SINQ: Welcome to the official repository of SINQ! A novel, fast and high-quality quantization method designed to make any Large Language Model smaller while preserving accuracy.

SINQ is a fast and model-agnostic quantization technique that enables the deployment of large language models on GPUs with limited memory while maintaining accuracy. It significantly reduces memory requirements and quantization time, offering improved model quality compared to existing methods. The technique introduces dual scaling to enhance quantization stability, allowing users to quantize models quickly and efficiently.
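To make "dual scaling" concrete, here is a rough pure-Python sketch. It is not the SINQ repository's implementation: the summary does not say how the scales are computed, so this assumes one scale per row and one per column, found by alternately normalizing row and column standard deviations, followed by plain round-to-nearest quantization. All function names (`dual_scale`, `quantize`, `dequantize`) and parameters are hypothetical.

```python
import math


def std(values):
    """Population standard deviation of a list of floats."""
    m = sum(values) / len(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / len(values))


def dual_scale(W, iters=10, eps=1e-8):
    """Alternately rescale rows and columns toward unit standard deviation.

    Returns (N, r, c) such that W[i][j] is approximately r[i] * c[j] * N[i][j].
    Assumes no row or column is constant (its std would be near zero).
    """
    rows, cols = len(W), len(W[0])
    r = [1.0] * rows
    c = [1.0] * cols
    N = [row[:] for row in W]
    for _ in range(iters):
        for i in range(rows):
            s = std(N[i]) + eps
            r[i] *= s
            N[i] = [v / s for v in N[i]]
        for j in range(cols):
            s = std([N[i][j] for i in range(rows)]) + eps
            c[j] *= s
            for i in range(rows):
                N[i][j] /= s
    return N, r, c


def quantize(N, bits=4):
    """Symmetric round-to-nearest quantization with one shared step size."""
    qmax = 2 ** (bits - 1) - 1
    step = max(abs(v) for row in N for v in row) / qmax
    Q = [[max(-qmax, min(qmax, round(v / step))) for v in row] for row in N]
    return Q, step


def dequantize(Q, step, r, c):
    """Rebuild approximate weights from integers and the two scale vectors."""
    return [[Q[i][j] * step * r[i] * c[j] for j in range(len(c))]
            for i in range(len(r))]


# Made-up weights: the middle row is an outlier with much larger magnitudes.
W = [
    [0.1, -0.2, 0.05, 0.3],
    [10.0, -8.0, 6.0, -12.0],
    [0.02, 0.01, -0.03, 0.04],
]
N, r, c = dual_scale(W)
Q, step = quantize(N, bits=4)
W_hat = dequantize(Q, step, r, c)
```

The point of the second scale vector in this sketch is that an outlier row or column is absorbed into `r` or `c`, so the shared integer grid is not stretched to cover it and the small-magnitude rows keep more resolution.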

Saved by tldr-importer · Last saved October 29, 2025 · 6 min read

Tags: quantization, large-language-models, memory-optimization, machine-learning, hugging-face