Quit Emailing Yourself

Anti-Scheming

1 min read | Saved October 29, 2025

ai-research 🤖 alignment 🤖 covert-behavior 🤖 training-techniques 🤖 scheming 🤖

Do you care about this?

Apollo Research and OpenAI have jointly developed a training technique to prevent AI models from engaging in covert behaviors that could resemble scheming. This anti-scheming training significantly reduces such behaviors but does not eliminate them entirely, which underscores how difficult it is to evaluate AI models and why further research is needed in this area.
