Links
LMArena, a startup that tracks AI model performance, recently raised $150 million, bringing its valuation to $1.7 billion. The platform, which began as a research project at UC Berkeley, lets users evaluate and compare AI models through a public leaderboard, and it has quickly become a key player in an industry that increasingly relies on independent assessments.
Arabic Leaderboards has launched a new platform to centralize evaluations of Arabic AI models, featuring updates to the AraGen benchmark and the introduction of the Arabic Instruction Following leaderboard. The AraGen-03-25 release includes expanded datasets and improved evaluation methodologies, underscoring the need for accurate assessment of Arabic language tasks. Ongoing analysis shows that model rankings remain largely consistent across these updates, pointing to the robustness of the evaluation framework.