1 link tagged with all of: performance + general-capability
Click any tag below to further narrow down your results
Links
This article analyzes how benchmark scores for AI models often reflect a single dimension of "general capability." It discusses the implications of this finding, particularly the contrasting ideas of whether model performance is based on a deep underlying ability or if it is contingent on specific skills. The author also introduces the concept of "Claudiness," which reveals limitations in certain model capabilities.