Links
Liger enhances TRL’s Group Relative Policy Optimization (GRPO) by reducing memory consumption by 40% during training without sacrificing model quality. The integration also introduces support for Fully Sharded Data Parallel (FSDP) and Parameter-Efficient Fine-Tuning (PEFT), facilitating scalable training across multiple GPUs. Additionally, Liger Loss can be paired with vLLM for accelerated text generation during training.
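A minimal configuration sketch of how this integration might be enabled in TRL. The `use_liger_loss` and `use_vllm` flags are assumptions based on the description above; check them against your installed TRL version before relying on them.

```python
# Hypothetical sketch: enabling the Liger GRPO loss and vLLM generation in TRL.
# Flag names are assumed, not verified against a specific TRL release.
from trl import GRPOConfig, GRPOTrainer

config = GRPOConfig(
    output_dir="grpo-liger",
    use_liger_loss=True,  # assumed flag: route the GRPO loss through Liger's kernel
    use_vllm=True,        # assumed flag: use vLLM for rollout generation during training
)

# trainer = GRPOTrainer(model="your-model-id", args=config, ...)
# trainer.train()
```

FSDP and PEFT would be configured through their usual TRL/Accelerate mechanisms alongside this config; the memory savings come from the fused loss kernel, so no change to the training loop itself is needed.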