
OpenAI this week released GPT-5.4 mini and GPT-5.4 nano, completing the GPT-5.4 family with two models designed for high-volume, latency-sensitive workloads.[1] The launch is not simply a cost reduction play. It reflects a structural shift in how AI is deployed: less as a monolithic reasoning engine, more as a hierarchy of specialized agents, each sized to its task.
GPT-5.4 mini runs more than twice as fast as GPT-5 mini and closes much of the gap with the flagship model on key benchmarks.[2] On SWE-Bench Pro, a test measuring a model's ability to resolve real GitHub issues, mini scores 54.4%, compared to 45.7% for GPT-5 mini and 57.7% for GPT-5.4 itself.[2] On OSWorld-Verified, which assesses desktop computer use by reading screenshots, mini reaches 72.1%, just below the human baseline of 72.4% and just short of GPT-5.4's 75.0%.[2]
GPT-5.4 nano occupies the lowest tier: 52.4% on SWE-Bench Pro and 39.0% on OSWorld, meaningfully below mini but a substantial leap over previous nano-class models.[2] It is API-only at launch, which signals OpenAI's intent clearly: nano is a developer primitive, not a consumer interface.
Pricing reflects the tiering. GPT-5.4 mini costs $0.75 per million input tokens and $4.50 per million output tokens. Nano is $0.20 input and $1.25 output, roughly four times cheaper on inputs than mini and more than twelve times cheaper than the full GPT-5.4 at $2.50/$15.00.[2]
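The tiering is easiest to see with a worked cost comparison. The sketch below uses the per-million-token prices quoted above; the model-name keys and the example workload are illustrative, not an API reference.

```python
# Cost comparison across the GPT-5.4 tiers at the quoted rates.
# Prices are (input $/1M tokens, output $/1M tokens).
PRICES = {
    "gpt-5.4":      (2.50, 15.00),
    "gpt-5.4-mini": (0.75, 4.50),
    "gpt-5.4-nano": (0.20, 1.25),
}

def cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request at the quoted per-million-token rates."""
    inp, out = PRICES[model]
    return (input_tokens * inp + output_tokens * out) / 1_000_000

# Example: a high-volume job of 10,000 requests, each with ~2,000
# input tokens and ~500 output tokens.
requests = 10_000
for model in PRICES:
    total = requests * cost(model, 2_000, 500)
    print(f"{model}: ${total:,.2f}")
# gpt-5.4:      $125.00
# gpt-5.4-mini: $37.50
# gpt-5.4-nano: $10.25
```

At this volume the full model costs roughly 12x what nano does, which is the arithmetic behind routing bulk work down-tier.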
What makes this launch consequential is less the individual model specs than the architectural pattern they enable. OpenAI explicitly positions mini and nano as subagent models: systems where a large reasoning model (GPT-5.4 Thinking, for instance) plans and coordinates while smaller models execute discrete tasks in parallel.[1] Searching a codebase, reading a file, processing a form, interpreting a screenshot: these are jobs where latency matters and where burning GPT-5.4 quota is economically irrational.
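The planner/subagent pattern can be sketched in a few lines. Everything here is hypothetical scaffolding: the `call_model` stub stands in for a real provider API call, and the model names and task decomposition are illustrative, not OpenAI's implementation.

```python
# Minimal sketch of the planner/subagent pattern: a large model plans,
# smaller models execute discrete subtasks in parallel.
from concurrent.futures import ThreadPoolExecutor

def call_model(model: str, prompt: str) -> str:
    # Stand-in for a real API call; tags the output with the model used.
    return f"[{model}] {prompt}"

def plan(goal: str) -> list[str]:
    # In a real system, the flagship model would decompose the goal.
    return [
        f"search codebase for '{goal}'",
        f"read files related to '{goal}'",
        f"summarize findings on '{goal}'",
    ]

def run(goal: str) -> list[str]:
    subtasks = plan(goal)  # one planning pass on the large model
    # Cheap, latency-sensitive subtasks fan out to the small model.
    with ThreadPoolExecutor() as pool:
        return list(pool.map(lambda t: call_model("gpt-5.4-mini", t), subtasks))

results = run("rate limiter bug")
```

The design point is the asymmetry: one expensive planning call, many cheap parallel execution calls, which is exactly where per-token pricing on the small tiers dominates total cost.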
Within Codex, this is already operational. GPT-5.4 mini consumes only 30% of the GPT-5.4 quota, and Codex can delegate less reasoning-intensive work to it.[2] Aabhas Sharma, CTO of AI research and analysis platform Hebbia, reported that mini "matched or exceeded competitive models on several output tasks and citation recall at a much lower cost" and achieved higher end-to-end pass rates and stronger source attribution than the full GPT-5.4 model on their evaluations.[2]
GPT-5.4 mini is available now in the API, in Codex, and in ChatGPT for Free and Go tier users via the "Thinking" option. For paid subscribers, it serves as the automatic rate-limit fallback for GPT-5.4 Thinking. Nano is API-only.[2]
The cadence of OpenAI's model releases this quarter (GPT-5.3 Instant, GPT-5.4 Thinking, GPT-5.4, GPT-5.4 mini, GPT-5.4 nano) reflects a deliberate effort to tile the cost-performance spectrum at every level. The strategy mirrors how cloud computing matured: dominant players won not just by having the best flagship instance type, but by offering the right size at the right price for every conceivable workload. The frontier model is the attention-getter. The nano is where the margin lives.