1 link tagged with all of: language-models + scaling-laws + data-analysis
Links
The paper proposes a method for estimating the memorization capacity of language models, separating unintended memorization from generalization. GPT-style models are estimated to hold about 3.6 bits per parameter; models memorize training data until this capacity is saturated, after which generalization takes precedence.
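To make the capacity figure concrete, here is an illustrative back-of-the-envelope calculation (not the paper's estimation method; the model and dataset sizes below are hypothetical):

```python
def estimated_capacity_bits(num_params: int, bits_per_param: float = 3.6) -> float:
    """Total memorization capacity, using the reported ~3.6 bits/parameter estimate."""
    return num_params * bits_per_param

# Hypothetical example: a 125M-parameter GPT-style model
params = 125_000_000
capacity = estimated_capacity_bits(params)  # ~4.5e8 bits

# Hypothetical training set size in bits; once it exceeds capacity,
# the model can no longer memorize everything and generalization dominates
dataset_bits = 8 * 10**9
regime = "memorization" if dataset_bits <= capacity else "generalization-dominated"
```

Under these made-up numbers the dataset far exceeds the model's capacity, so the model would be in the regime where generalization takes over.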