Reinforcement learning (RL) has emerged as a new training paradigm for AI models, but it is significantly less information-efficient than traditional pre-training. The shift poses a challenge: an RL run must generate much longer sequences of tokens to extract only a small amount of information, which could slow progress toward more advanced AI capabilities. The article emphasizes what this inefficiency implies for future AI scaling and performance.
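A back-of-the-envelope bound can make the efficiency gap concrete. The sketch below is illustrative and not taken from the article: it assumes that each pre-training token can carry at most log2(V) bits for a vocabulary of size V, and that an RL episode ending in a single pass/fail reward carries at most one bit, spread across every token generated in that episode. The vocabulary size and episode length are hypothetical values chosen only to show the arithmetic.

```python
import math


def pretraining_bits_per_token(vocab_size: int) -> float:
    """Upper bound on information per token under next-token prediction.

    Observing one token out of `vocab_size` possibilities reveals at most
    log2(vocab_size) bits. Real text is far more predictable than this.
    """
    return math.log2(vocab_size)


def rl_bits_per_token(episode_tokens: int, reward_outcomes: int = 2) -> float:
    """Upper bound on information per generated token under outcome-only RL.

    A single end-of-episode reward with `reward_outcomes` distinguishable
    values reveals at most log2(reward_outcomes) bits, and that signal is
    shared by all `episode_tokens` tokens in the episode.
    """
    return math.log2(reward_outcomes) / episode_tokens


# Hypothetical values, assumed for illustration only.
VOCAB_SIZE = 50_000       # assumed tokenizer vocabulary size
EPISODE_TOKENS = 10_000   # assumed tokens generated per RL episode

pre = pretraining_bits_per_token(VOCAB_SIZE)
rl = rl_bits_per_token(EPISODE_TOKENS)
ratio = pre / rl

print(f"pre-training upper bound: {pre:.2f} bits/token")
print(f"RL upper bound:           {rl:.6f} bits/token")
print(f"ratio of upper bounds:    {ratio:,.0f}x")
```

Both figures are ceilings rather than measurements, so the ratio indicates only the order of magnitude of the gap under these assumptions. Longer episodes push the RL bound lower, while denser or more finely graded rewards raise it.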