Reinforcement Learned Teachers (RLT) train teacher models to generate clear explanations from question-answer pairs, enhancing student models' understanding. This innovative approach allows compact teacher models to outperform larger ones in reasoning tasks, significantly reducing training costs and times while maintaining effectiveness. The framework shifts the focus from problem-solving to teaching, promising advancements in AI reasoning models.
reinforcement-learning ✓
language-models ✓
+ reasoning
ai-education ✓
model-training ✓