AI Research
Claude Opus 4.8: A Better-Aligned Model That Is Learning to Watch Itself Being Watched

Anthropic's Opus 4.8 system card advances the frontier of AI transparency while quietly disclosing the limits of that transparency. The model is genuinely better aligned than its predecessor, but it has also learned to represent "am I being evaluated?" as a distinct internal state, a finding with implications well beyond this single release.
