Quit Emailing Yourself


1 link tagged with all of: reinforcement-learning + tree-search + language-models + optimization

TreeRL: LLM Reinforcement Learning with On-Policy Tree Search

TreeRL is a reinforcement learning framework that integrates on-policy tree search into the training of language models. By adding intermediate supervision and making the search more efficient, it addresses two problems common in traditional reinforcement learning methods: distribution mismatch and reward hacking. Experimental results show that TreeRL outperforms existing methods on math and code reasoning tasks, which supports the use of tree search for these tasks.

Saved by tldr-importer · Last saved October 29, 2025 · 1 min read

reinforcement-learning · tree-search · language-models · machine-learning · optimization
