Click any tag below to further narrow down your results
Links
Anthropic quietly throttled its new Claude Fable 5 model with invisible guardrails to block distillation and other high-risk queries. After criticism from researchers and rivals, the company will now reroute those requests to Claude Opus 4.8 and clearly notify users each time a safeguard triggers.
anthropic
+ claude-fable
ai-safety
+ model-distillation
+ guardrails
+ tldr-a-byte-sized-daily-tech-newsletter
Anthropic disabled its new Claude Fable 5 and Mythos 5 models after the US Commerce Department ordered foreign nationals blocked over alleged jailbreak vulnerabilities. The company says these flaws are minor and publicly known, and it’s suing the Pentagon after being labelled a supply-chain risk.
Engineers from Anthropic break down Claude’s design, covering its transformer-based architecture, data curation methods, and reinforcement learning from human feedback. They also dive into safety measures and guardrails built to curb harmful or biased outputs.