Large language models are trained on decades of accessible text, but their data consumption now outpaces human production, creating a need for self-generated experience in AI. The article discusses the importance of exploration in reinforcement learning, how better exploration can improve a model's generalization, and the role pretraining plays in easing exploration challenges. It argues that future AI progress will depend more on collecting the right experiences than on merely increasing model capacity.
A new method for estimating the memorization capacity of language models is proposed, separating unintended memorization from generalization. The study estimates that GPT-style models have a capacity of about 3.6 bits per parameter: models memorize training data until this capacity is saturated, after which generalization begins to take precedence.
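To give the 3.6 bits-per-parameter figure a concrete scale, the sketch below is an illustrative back-of-envelope calculation (not code from the study): it multiplies the reported per-parameter estimate by a parameter count and converts the result to megabytes. The function names and the example model size are assumptions for illustration only.

```python
# Back-of-envelope estimate of total memorization capacity,
# assuming the reported ~3.6 bits per parameter for GPT-style models.

BITS_PER_PARAM = 3.6  # estimate from the study; treated here as a constant

def memorization_capacity_bits(num_params: int) -> float:
    """Rough total memorization capacity in bits for a model of this size."""
    return BITS_PER_PARAM * num_params

def capacity_megabytes(num_params: int) -> float:
    """The same capacity expressed in megabytes (8e6 bits per MB)."""
    return memorization_capacity_bits(num_params) / 8 / 1e6

# Hypothetical example: a 1-billion-parameter model
params = 1_000_000_000
print(f"{memorization_capacity_bits(params):.2e} bits")  # -> 3.60e+09 bits
print(f"{capacity_megabytes(params):.0f} MB")            # -> 450 MB
```

Under this estimate, even a billion-parameter model can store only a few hundred megabytes of raw training data verbatim, which helps explain why memorization gives way to generalization once the training set exceeds that budget.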