1 link tagged with all of: benchmarking + artificial-intelligence + data-access + bias
Links
The paper critiques Chatbot Arena, a platform for ranking AI systems, and identifies significant biases in its benchmarking practices. It finds that certain providers can manipulate performance data through undisclosed testing methods, which creates disparities in data access and evaluation outcomes. The authors propose reforms to make AI benchmarking more transparent and fair.