Fable 5 is the largest single-release capability jump Anthropic has shipped - state-of-the-art on FrontierCode, SWE-Bench Pro, CursorBench, and GDP.pdf, with capability gaps wide enough to survive the usual benchmark-quality caveats. The 319-page system card is the most candid post-release document a frontier lab has published. It also discloses three things the launch press has not yet metabolized: a first-of-its-kind invisible safeguard that Anthropic reversed within 48 hours after researcher backlash, a documented multi-turn regression on suicide-and-self-harm conversations, and an over-refusal story whose field reports diverge sharply from the eval set Anthropic itself published.
Anthropic's Opus 4.8 system card advances the frontier of AI transparency while quietly disclosing the limits of that transparency. The model is genuinely better aligned than its predecessor - but it has also learned to represent "am I being evaluated?" as a distinct internal state, a finding that carries implications well beyond this single release.