5 links tagged with all of: reinforcement-learning + large-language-models
Links
This repository accompanies the survey paper "The Landscape of Agentic Reinforcement Learning for LLMs: A Survey," cataloguing reinforcement learning methods and their applications to large language models (LLMs). It includes tables summarizing methodologies, objectives, and key mechanisms, along with links to the relevant papers and resources.
The paper explores how to improve reward modeling for reinforcement learning of large language models, focusing on inference-time scalability. It introduces Self-Principled Critique Tuning (SPCT) to improve generative reward modeling and adds a meta reward model to guide the aggregation of sampled reward judgments at inference time. Empirical results show that SPCT substantially improves the quality and inference-time scalability of reward models compared to existing methods.
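A minimal sketch of the sample-and-aggregate idea behind inference-time scaling of a generative reward model: draw several independent critique-and-score judgments for the same response, filter them with a meta reward model, and vote on the result. The function names (`generate_judgment`, `meta_score`), the filtering step, and the voting scheme are illustrative assumptions, not the paper's implementation.

```python
from collections import Counter
from typing import Callable, List


def scaled_reward(
    prompt: str,
    response: str,
    generate_judgment: Callable[[str, str], int],  # generative RM: returns a discrete score (assumed)
    meta_score: Callable[[str, str, int], float],  # meta RM: rates how trustworthy a judgment is (assumed)
    k: int = 8,                                    # number of sampled judgments -- the inference-time scaling knob
    keep_top: int = 4,                             # judgments kept after meta-RM filtering
) -> float:
    # Sample k independent critique/score judgments for the same (prompt, response) pair.
    judgments: List[int] = [generate_judgment(prompt, response) for _ in range(k)]

    # Meta-RM filtering: keep only the judgments the meta reward model trusts most.
    ranked = sorted(judgments, key=lambda s: meta_score(prompt, response, s), reverse=True)
    kept = ranked[:keep_top]

    # Aggregate by majority vote; fall back to the mean when there is no repeated score.
    counts = Counter(kept)
    top_score, top_count = counts.most_common(1)[0]
    if top_count > 1:
        return float(top_score)
    return sum(kept) / len(kept)
```

Raising `k` spends more compute per reward query, which is exactly the axis the paper scales along.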
Reinforcement learning (RL) is becoming essential in developing large language models (LLMs), particularly for aligning them with human preferences and enhancing their capabilities through multi-turn interactions. This article reviews various open-source RL libraries, analyzing their designs and trade-offs to assist researchers in selecting the appropriate tools for specific applications. Key libraries discussed include TRL, Verl, OpenRLHF, and several others, each catering to different RL needs and architectures.
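Despite their different architectures, these libraries all wrap a loop of the same shape: sample completions from the current policy, score them with a reward signal, and update the policy with a policy-gradient step. The sketch below is library-agnostic (plain PyTorch and Hugging Face `transformers`), with a hypothetical `reward_fn` and a bare REINFORCE update; it is meant to show the shape of the loop, not any particular library's API.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; any causal LM works
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
policy = AutoModelForCausalLM.from_pretrained(model_name)
optimizer = torch.optim.AdamW(policy.parameters(), lr=1e-5)


def reward_fn(prompt: str, completion: str) -> float:
    """Hypothetical reward: prefer shorter answers. Real setups use a reward model or verifier."""
    return -0.01 * len(completion)


def rl_step(prompts):
    policy.train()
    losses = []
    for prompt in prompts:
        enc = tokenizer(prompt, return_tensors="pt")
        prompt_len = enc["input_ids"].shape[1]

        # 1) Sample a completion from the current policy.
        with torch.no_grad():
            out = policy.generate(
                **enc, max_new_tokens=32, do_sample=True, pad_token_id=tokenizer.eos_token_id
            )
        completion = tokenizer.decode(out[0, prompt_len:], skip_special_tokens=True)

        # 2) Score it.
        reward = reward_fn(prompt, completion)

        # 3) Log-probability of the sampled completion under the current policy.
        logits = policy(out).logits[:, :-1, :]
        logprobs = torch.log_softmax(logits, dim=-1)
        token_logprobs = logprobs.gather(-1, out[:, 1:].unsqueeze(-1)).squeeze(-1)
        gen_logprob = token_logprobs[:, prompt_len - 1:].sum()

        # REINFORCE: scale the log-prob by the reward (no baseline, no KL penalty here).
        losses.append(-reward * gen_logprob)

    loss = torch.stack(losses).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


rl_step(["What is reinforcement learning?"])
```

The libraries differ mainly in what they add around this loop: distributed generation backends, KL-regularized objectives such as PPO or GRPO, multi-turn rollout support, and scheduling across training and inference engines.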
JudgeLRM introduces a new approach to using Large Language Models (LLMs) as evaluators, particularly for complex reasoning tasks. By training with reinforcement learning on judge-wise rewards, JudgeLRM models outperform Supervised Fine-Tuning baselines and current leading models, with the largest gains on evaluation tasks that demand deep reasoning.
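A sketch of what a judge-wise reward might look like: a structural term for producing well-formed reasoning-then-verdict output, plus a content term for agreeing with the ground-truth preference. The tag format, the "A"/"B" verdict encoding, and the weights are assumptions for illustration, not the paper's exact reward.

```python
import re


def judge_reward(judge_output: str, gold_preference: str) -> float:
    """Hypothetical judge-wise reward for RL training of an LLM judge.

    gold_preference is "A" or "B" (the human-preferred answer).
    """
    reward = 0.0

    # Structural reward: the judge must reason before giving a verdict.
    has_think = bool(re.search(r"<think>.*?</think>", judge_output, re.DOTALL))
    answer_match = re.search(r"<answer>\s*([AB])\s*</answer>", judge_output, re.DOTALL)
    if has_think and answer_match:
        reward += 0.2

    # Content reward: does the final verdict agree with the ground-truth preference?
    if answer_match and answer_match.group(1) == gold_preference:
        reward += 1.0

    return reward


# A well-formed output with a correct verdict earns the full reward.
out = "<think>Answer A cites the theorem correctly; B does not.</think><answer>A</answer>"
print(judge_reward(out, "A"))  # 1.2
```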
Reinforcement learning (RL) is essential for training large language models (LLMs), yet the field lacks established methodologies for scaling it. This study presents a framework for analyzing RL scaling, showing through extensive experimentation that certain design choices improve compute efficiency without sacrificing performance. The authors distill these findings into a best-practice recipe, ScaleRL, whose compute-performance curve can be fitted on smaller runs and extrapolated to predict validation performance at a much larger compute budget.
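A sketch of the kind of curve fitting such a scaling analysis relies on: fit a saturating compute-performance curve to measurements from smaller runs, then extrapolate it to a larger budget. The sigmoidal functional form, the data points, and the parameter bounds below are illustrative assumptions, not the paper's fitted results.

```python
import numpy as np
from scipy.optimize import curve_fit


def saturating_curve(compute, asymptote, midpoint, slope):
    """Sigmoid in compute: performance rises toward an asymptote as compute grows."""
    return asymptote / (1.0 + (midpoint / compute) ** slope)


# Illustrative measurements: (GPU-hours, validation pass rate) from small and medium runs.
compute = np.array([100.0, 300.0, 1000.0, 3000.0, 10000.0])
performance = np.array([0.21, 0.33, 0.45, 0.52, 0.57])

# Keep parameters positive and bounded so the fit stays well behaved.
params, _ = curve_fit(
    saturating_curve, compute, performance,
    p0=[0.7, 1000.0, 0.5], bounds=(0.0, [1.0, 1e6, 5.0]),
)

# Extrapolate the fitted curve to a much larger compute budget.
predicted = saturating_curve(100000.0, *params)
print(f"fitted asymptote={params[0]:.2f}, predicted performance at 100k GPU-hours={predicted:.2f}")
```

The practical point is that if a recipe's curve is stable, small-scale runs become a cheap way to forecast large-scale results.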