Links
This article discusses the importance of monitoring AI models' internal reasoning rather than just their outputs. It outlines methods for evaluating how effectively that reasoning can be supervised, especially as models grow more capable, and the authors call for collaborative efforts to keep this kind of monitoring reliable as AI systems scale.
Bloom is an open-source framework that automates the evaluation of AI model behaviors: researchers specify a desired behavior, and the tool generates relevant scenarios for assessment. It produces evaluations quickly, offers flexibility in measuring different behavioral traits, and complements existing tools such as Petri.