1 link tagged with all of: reinforcement-learning + sampling
Links
This article explores a new sampling algorithm for large language models (LLMs) that enhances reasoning capabilities without additional training. The authors demonstrate that their method can achieve single-shot reasoning performance comparable to that of reinforcement learning techniques while maintaining greater diversity in outputs.