Frontier Models

No. 15

Anthropic Shipped an Invisible Safeguard. Both Readings Are True.

Jun 15, 2026

AI Policy·Noah Ogbi·20 minJun 15

Page 13 of Claude Fable 5's 319-page system card disclosed that the model silently degrades its own responses to requests touching frontier AI development, without notifying users. Within hours, researchers cried "secret sabotage." Within 36 hours, Anthropic reversed the invisibility, calling it "the wrong tradeoff." Within 24 hours of that reversal, the U.S. government issued an export control directive suspending all access to Fable 5 and Mythos 5 for foreign nationals worldwide, citing the same national-security rationale Anthropic had introduced just the day before. The honest read was always that both interpretations sit on the same page of the same document. The government's directive proved neither reading was wrong.

GPT-5.6 Sol or Claude Fable 5: Which One Should You Actually Build On?

Claude Sonnet 5 Knows When It's Being Tested. Its Safety Card Says So.

Claude Sonnet 5 Reviewed: Opus-Class Autonomy at a Sonnet Price

GPT-5.6 Reviewed: Three Models, Two New Modes, and a Governance First

Inside GPT-5.5-Cyber: The Opposite Bet to Anthropic's Fable 5

Anthropic Shipped an Invisible Safeguard. Both Readings Are True.

Inside Claude Fable 5: Anthropic's Most Powerful Public Model - and Its Most Asterisked One

When the AI Writes the Lab Notebook: GPT-5's Autonomous Biology Run Changes What Science Looks Like

OpenAI Just Shipped What Anthropic Won't. Now We Find Out What Restraint Costs.

The Self-Improving Machine: How AI Is Learning to Build Its Own Successors

GLM-5.1 and the Benchmark That Got Complicated

The Benchmark Racket: Why the Frontier Model Race Is Measuring the Wrong Thing

Gemini 3.1 Pro Reviewed: Google's Reasoning Reversal

GPT-5.4 Mini and Nano Are Built for the Age of AI Agents

Mistral Small 4 Review: One Model, Three Jobs

Pro, Con, Pro: What an AI's Verdict on Its Own Future Reveals

More Than a Better Model: GPT-5.4 Is OpenAI's Blueprint for the Agentic Enterprise

OpenAI Releases GPT-5.3 Instant, Targeting Conversational Quality Over Raw Performance

GPT-5.3 Codex vs. Claude Opus 4.6: Two Philosophies, One Problem

Inside Claude Opus 4.6: Anthropic's Most Capable and Scrutinized Model Yet