# anthropic → ai-safety

3 links tagged with all of: anthropic + ai-safety

Links

Anthropic backpedals on Fable safety measure

Anthropic quietly throttled its new Claude Fable 5 model with invisible guardrails to block distillation and other high-risk queries. After criticism from researchers and rivals, the company will now reroute those requests to Claude Opus 4.8 and clearly notify users each time a safeguard triggers.

Last saved Jun 18, 2026 · 2 min read

anthropic + claude-fable + ai-safety + model-distillation + guardrails + tldr-a-byte-sized-daily-tech-newsletter

Anthropic's Claude Fable 5 and Mythos 5 AI suspended over security fears

Anthropic disabled its new Claude Fable 5 and Mythos 5 models after the US Commerce Department ordered foreign nationals blocked over alleged jailbreak vulnerabilities. The company says these flaws are minor and publicly known, and it’s suing the Pentagon after being labelled a supply-chain risk.

Last saved Jun 14, 2026 · 3 min read

anthropic + claude-fable-5 + ai-safety + cybersecurity + us-government

The Secrets of Claude Code From the Engineers Who Built It

Engineers from Anthropic break down Claude’s design, covering its transformer-based architecture, data curation methods, and reinforcement learning from human feedback. They also dive into safety measures and guardrails built to curb harmful or biased outputs.

Last saved Oct 31, 2025 · 1 min read

anthropic + claude + transformer-architecture + rlhf + ai-safety