It's hard to make scheming evals look realistic for LLMs

Creating realistic scheming evaluations for LLMs is difficult because models such as Claude 3.7 Sonnet can easily recognize evaluation contexts. Attempts to make the evaluations more realistic by modifying prompts have had limited success, which suggests that the structure of these evaluations needs a fundamental rethink. Evaluation awareness could pose significant challenges for future LLM assessments.

Saved by tldr-importer · Last saved October 29, 2025 · 6 min read
