2 links tagged with all of: training + reinforcement-learning
Links
This article describes Endless Terminals, a system that automatically generates terminal-based tasks for training reinforcement learning agents without human input. It walks through setup, task generation, and evaluation, each driven by specific Python scripts and configurations. The framework supports a range of models.
The SGLang RL team developed an end-to-end INT4 Quantization-Aware Training (QAT) pipeline that enhances training efficiency and model stability. By using fake quantization during training and real quantization at inference, they achieved significant performance improvements for large models on a single GPU. The article details the technical steps taken and results of their approach.
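To make the fake-versus-real quantization distinction in the summary above concrete, here is a minimal NumPy sketch. It is not the SGLang RL team's implementation: the per-tensor symmetric scale and the function names (`int4_scale`, `real_quantize`, `dequantize`, `fake_quantize`) are assumptions chosen for illustration, and the straight-through gradient used during actual QAT is left out.

```python
import numpy as np

INT4_MIN, INT4_MAX = -8, 7  # signed 4-bit integer range


def int4_scale(w: np.ndarray) -> float:
    """Per-tensor symmetric scale (an assumption; per-group scales are also common)."""
    max_abs = float(np.max(np.abs(w)))
    return max_abs / INT4_MAX if max_abs > 0 else 1.0


def real_quantize(w: np.ndarray, scale: float) -> np.ndarray:
    """Inference-style quantization: round to integers in the INT4 range.

    NumPy has no 4-bit dtype, so the values are held in int8 here.
    """
    q = np.clip(np.round(w / scale), INT4_MIN, INT4_MAX)
    return q.astype(np.int8)


def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Map INT4 codes back to float32."""
    return q.astype(np.float32) * np.float32(scale)


def fake_quantize(w: np.ndarray, scale: float) -> np.ndarray:
    """Training-style fake quantization: quantize, then immediately dequantize.

    The result stays in float32 but only takes values the INT4 grid can represent,
    so the forward pass sees the same rounding error that inference will.
    """
    return dequantize(real_quantize(w, scale), scale)


if __name__ == "__main__":
    w = np.array([-1.0, -0.3, 0.0, 0.2, 0.7], dtype=np.float32)
    scale = int4_scale(w)
    print("INT4 codes:     ", real_quantize(w, scale))
    print("fake-quantized: ", fake_quantize(w, scale))
```

In this sketch the fake-quantized weights match what dequantizing the stored INT4 codes would give at inference, which is the property that lets training anticipate the quantization error.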