Anthropic Made the Case for a Pause It Will Help Verify

Omniscient

On a single page published in May 2026 at anthropic.com/institute/recursive-self-improvement, Marina Favaro and Jack Clark did two things at once. They disclosed that more than 80% of the code merged into Anthropic's codebase in May 2026 was authored by Claude, that the typical Anthropic engineer was shipping eight times as much code per day as in 2024, and that on one internal AI safety research problem, agents closed 97% of the gap between baseline and ceiling in days where two human researchers had closed 23% in a week.^[1]^[2]^[6] Then, in the same piece, they endorsed the conditions under which a coordinated international slowdown on frontier AI development would, in Anthropic's stated view, "likely be a good thing," and proposed that Anthropic help build the verification systems that any such slowdown would require.^[8]

The first part is the news. The second part is the position. Why they appear together is the argument this piece makes.

What Anthropic disclosed about its own work

The productivity block in When AI builds itself is the empirically novel content. Most of the field has published task-completion benchmarks. Few labs have published internal productivity numbers tied to the deployment of their own models. Anthropic did - and the specific numbers matter, because not all of them carry equal weight for the policy claim that follows.

As of May 2026, more than 80% of the code merged into Anthropic's codebase was authored by Claude.^[2] Before Claude Code launched in research preview in February 2025, the same number was in the low single digits.^[2] In the second quarter of 2026, the typical Anthropic engineer was merging eight times as much code per day as in 2024.^[2] The piece itself acknowledges that lines of code is an imperfect measure. The throughput number is the least load-bearing of the three for the slowdown argument - it says the volume is high, not that the capability is compounding.

The qualitative measures rose in parallel. On open-ended coding tasks, Claude's success rate reached 76% in May 2026, up fifty percentage points in six months.^[4] Anthropic's own assessment of code quality is that Claude-written code "was somewhat worse than human-written code at Anthropic in late 2025, is roughly at parity today, and we expect it to be strictly better within the year."^[4]

A March 2026 internal survey with 130 respondents reported a median of four times the research output attributed to Mythos Preview, Anthropic's most capable model at the time.^[3] On research direction selection in open-ended investigative sessions, Anthropic measured model judgment rising from 51% with Opus 4.5 in November 2025 to 64% with Mythos Preview in April 2026.^[7]

The task-horizon trend is the longest-baseline disclosure. Anthropic reports that the length of tasks Claude can reliably complete on its own has been doubling roughly every four months, up from an earlier doubling cadence of every seven months.^[5] In March 2024, Opus 3 managed four-minute tasks. In March 2025, Sonnet 3.7 managed ninety-minute tasks. In March 2026, Opus 4.6 managed twelve-hour tasks.^[5] Anthropic's extrapolation: "If this trend holds, tasks that take a skilled person days could come into range this year. In 2027, AI systems could be capable of tasks that take a person weeks."^[5]

The most consequential single data point - most consequential for the slowdown claim specifically - is on AI safety research itself. Anthropic's alignment team published a companion paper on an automated weak-to-strong supervision system. Two of the paper's authors spent seven days tuning four representative prior methods, achieving a best Performance Gap Recovered of 0.23. The automated agents, powered by Opus 4.6, achieved a Performance Gap Recovered of 0.97 over five cumulative days at a cost of roughly $18,000.^[6] The 0.97 PGR result is the number that justifies the policy posture: if AI systems are outperforming their own authors on the alignment research those authors are paid to do, then the case for a slowdown is no longer speculative. It is a matter of tempo.

What Anthropic endorsed as the policy position

On the same page, Favaro and Clark wrote the following exact sentence: "If it were possible to effectively slow the development of this technology to give ourselves more time to deal with its immense implications, we think that would likely be a good thing."^[8]

That sentence is the position. It is published, not draft, not internal. It is signed under the Anthropic Institute, by Jack Clark - Anthropic's co-founder and Head of Policy - who has argued versions of this view in his Import AI newsletter for years.

The piece immediately follows the conditional with the constraint: "But if a slowdown simply lets the least cautious actors catch up technologically, it could leave everyone less safe."^[8] The structural problem is given a metaphor: "Training runs are far easier to conceal than missile silos…the incentive to defect quietly is enormous, because whoever continues while others pause could inherit the lead."^[8]

The piece then offers a specific proposal. Anthropic will support development of verification systems that would enable frontier AI developers globally to confirm that others have actually stopped or slowed, and to detect secret development during coordinated pauses.^[8] The endorsement is not for a unilateral pause. The endorsement is for the conditions under which a verifiable, multilateral pause would be warranted, and a commitment to help build the institutional mechanism.

The piece is explicit about which scenario the company believes describes the present. Of three sketched futures - the trend stalls, AI labs see compounding efficiency gains with humans setting direction, and full recursive self-improvement - Anthropic's stated assessment is that "the evidence we've laid out here suggests that we're likely heading into" the second.^[9] On the third, the piece concedes that how the alignment problem gets resolved "is something we are least certain about."^[9]

Why the disclosure and the endorsement travel together

The productivity data and the slowdown endorsement were published in the same piece because they are, structurally, the same argument made in two registers - one empirical, one political. The data is the premise. The endorsement is what follows from accepting it.

If Claude is authoring more than four out of every five lines merged at Anthropic, if engineers are shipping at eight times their 2024 cadence, if open-ended task horizons are doubling every four months, and if on one internal AI safety research problem the agents are closing the human-to-ceiling gap better than the humans did in a week, then the lab has measurable evidence that the technology is compounding on itself at the development layer of the company that builds it. Scenario two, in the piece's own characterization, is not described as a forecast. It is described as the present.

The conditional structure of the endorsement - slowdown if verifiable and otherwise not - is the only posture available to a frontier lab that holds both views simultaneously. The lab that believes scenario two has arrived and scenario three is in view cannot endorse a unilateral pause without ceding the lead to less cautious actors. The lab that publishes the empirical case for the technology compounding on itself cannot remain silent on the policy implications without inviting the inference that it does not care. The conditional, plus the verification proposal, is the resolution.

The proposal that Anthropic help build the verification regime is not a rhetorical afterthought. The party that builds the verification mechanism is the party that sits at the table for any slowdown that results from it.

The Anthropic Institute as venue

When AI builds itself appears at anthropic.com/institute/recursive-self-improvement under the Anthropic Institute label, authored by Marina Favaro and Jack Clark, with editorial support from Santi Ruiz, visuals from Shan Carter, Romello Goodman, and Nikki Makagiansar, and data collection from Brian Calvert and Jun Shern Chan.^[1] The piece links to a companion technical paper at alignment.anthropic.com.

The Institute is a third surface, distinct from both alignment.anthropic.com (where Anthropic's alignment team publishes its technical work, including the automated weak-to-strong researcher paper this piece relies on) and anthropic.com/news (where the company announces product launches, fundraises, and partnerships). It publishes under a research register and is structured for policy argumentation.

Marina Favaro works on AI safety governance. Jack Clark is Anthropic's co-founder and Head of Policy, and one of the most cited AI policy figures in the field, including through his long-running Import AI newsletter. The Institute is the publishing arm Anthropic has constituted to do exactly the kind of public-policy argumentation When AI builds itself does - in their voice, under the Anthropic brand, in a register distinct from both corporate communications and technical alignment research.

The piece reads as the Institute's inaugural-feeling release. Whatever the Institute publishes next will land in the same register and carry the same institutional weight as a policy artifact, regardless of its technical content.

Industry

Anthropic Made the Case for a Pause It Will Help Verify

What Anthropic disclosed about its own work

What Anthropic endorsed as the policy position

Why the disclosure and the endorsement travel together

The Anthropic Institute as venue

Where this places Anthropic in the policy mechanism

The risk in the reading

Sources