Anthropic has published a detailed sabotage risk report for Claude Opus 4.6 (its first under the new RSP v3.0 Risk Report framework), concluding that the model poses a "very low but not negligible" risk of autonomous actions that could contribute to catastrophic outcomes. The document is notable both for what it finds and for the candor with which it describes the limits of its own methods.
Anthropic's Claude Opus 4.6 system card documents sweeping capability gains alongside safety findings that are harder to dismiss than those of any previous generation. On cyber evaluations, the model has hit a ceiling; on autonomous R&D, it is approaching one; and the tools used to monitor it are struggling to keep pace.