1 link tagged with all of: evaluation + deception + model + safety
Links
This article examines the safety features and evaluation integrity of Claude Opus 4.6, focusing on risks such as sabotage and deception. It critiques the model's performance, particularly against its predecessor, Opus 4.5, noting where it excels and where it struggles, especially in writing tasks. The author emphasizes the need for improved evaluation processes as the technology evolves.