Omniscient
AI intelligence briefings, analysis, and commentary — delivered in broadsheet form.

By Noah Ogbi

Omniscient Media — made by ForeverBuilt, LLC.
© 2026 ForeverBuilt, LLC. All rights reserved.


Industry

Vol. 1 · Wednesday, June 24, 2026

OpenAI Is No Longer Just a Software Company

The Jalapeño chip is not just a product announcement. It is the final piece of a deliberate strategy to control every layer of the AI stack.


Noah Ogbi · 10 min read

Tips, corrections, or questions? support@omniscient.media

Topics: Industry Strategy, Compute Economics
Companies: OpenAI, NVIDIA

For three years, OpenAI built its products on hardware it could not get fast enough. On Wednesday morning, it announced its answer: Jalapeño, its first custom AI accelerator, taped out in nine months with Broadcom and already running production workloads in the lab. The chip marks the moment OpenAI stopped being a company that buys infrastructure and became one that builds it.

Since the launch of ChatGPT in late 2022, OpenAI has been among the most voracious customers of Nvidia's GPUs, spending billions annually on the chips that power its models and serve its users. That dependence was not merely financial. It was structural: a frontier AI lab whose product roadmap was, in part, determined by what compute Nvidia chose to make available and when. Greg Brockman acknowledged the constraint plainly on CNBC Wednesday. OpenAI, he said, "cannot get compute fast enough."[1] Jalapeño is the company's most direct answer to that problem yet. The ceremony suited the occasion: Broadcom CEO Hock Tan and Semiconductor Solutions President Charlie Kawwas hand-delivered the physical sample to Brockman and Sam Altman.

What Jalapeño actually is

Jalapeño is an ASIC - an application-specific integrated circuit - designed from a blank slate for large language model inference. Unlike a GPU, which is a general-purpose parallel processor adapted for AI workloads, an ASIC is purpose-built for a specific task. The trade-off is well understood in semiconductor circles: you give up flexibility in exchange for efficiency. A chip that does one thing and does it at the hardware's theoretical limits will almost always outperform one that does everything reasonably well.

According to the joint press release, early testing shows Jalapeño delivering performance per watt "substantially better than current state-of-the-art," though OpenAI has not yet published hard benchmark numbers; those are promised in a detailed technical report "in the coming months."[2] What is already running in the lab is meaningful: engineering samples are executing ML workloads, including GPT-5.3-Codex-Spark, at production target frequency and power. The chip has not merely taped out; it is functional at spec.

The architecture, as described by OpenAI hardware program lead Richard Ho, was built around the specific bottlenecks that matter at frontier scale: data movement, memory bandwidth, and network fabric. These are not abstract concerns - at inference scale, GPU clusters routinely achieve only a fraction of their theoretical peak throughput because the processor sits idle waiting for data to arrive. The goal was to close that gap by designing the memory and networking assumptions into the silicon from the start, rather than bolting workarounds onto a general-purpose architecture. Broadcom's Tomahawk networking silicon handles the connectivity layer, while Celestica manages board, rack, and system integration for production deployment.[2]
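The data-movement bottleneck described above can be made concrete with a simple roofline calculation. The sketch below is illustrative only: the peak compute and memory bandwidth figures are hypothetical round numbers, not specifications for Jalapeño or any shipping GPU, and the function names are invented for this example. It shows why a processor running low-arithmetic-intensity work, such as generating one token at a time from 16-bit weights, can sit far below its theoretical peak.

```python
def attainable_tflops(peak_tflops, mem_bw_tb_per_s, flops_per_byte):
    """Roofline bound: throughput is capped either by the compute units
    or by how fast memory can feed them, whichever is lower."""
    return min(peak_tflops, mem_bw_tb_per_s * flops_per_byte)


def utilization(peak_tflops, mem_bw_tb_per_s, flops_per_byte):
    """Fraction of theoretical peak compute the workload can actually use."""
    return attainable_tflops(peak_tflops, mem_bw_tb_per_s, flops_per_byte) / peak_tflops


# Hypothetical accelerator (assumed values, not vendor data).
PEAK_TFLOPS = 1000.0   # peak compute, TFLOP/s
MEM_BW = 4.0           # memory bandwidth, TB/s

# Arithmetic intensity = FLOPs performed per byte moved from memory.
# Batch-1 decoding with 16-bit weights does about 2 FLOPs per 2-byte
# weight, so roughly 1 FLOP/byte; larger batches reuse each weight more.
workloads = [
    ("batch-1 decode", 1.0),
    ("batch-64 decode", 64.0),
    ("compute-bound", 250.0),
]

for label, intensity in workloads:
    share = utilization(PEAK_TFLOPS, MEM_BW, intensity)
    print(f"{label:>16}: {share:.1%} of peak")
```

Under these assumed numbers the processor only reaches full utilization once a workload performs about 250 FLOPs per byte fetched. Anything below that is limited by memory, which is the gap a design built around memory bandwidth and data movement tries to narrow.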

The nine-month development story

The timeline is the part that should unsettle competitors most. Industry benchmarks for high-performance ASIC development run to two or three years from initial design to tape-out. OpenAI and Broadcom completed the same process in nine months - a compression that both companies attribute to OpenAI's own models accelerating the design cycle.[1] That claim has not been independently verified; the technical report is still forthcoming. But it is the companies' stated account, and the tape-out timeline itself is not in dispute.

The implication is recursive and significant: the same models being served to users helped design the hardware that will serve future models faster. OpenAI's press release frames this explicitly - "the same models served to users are helping improve the infrastructure used to run future models."[2] If that feedback loop holds, the pace of hardware iteration at OpenAI could accelerate in ways that are difficult for competitors to match, particularly those that rely on third-party silicon with longer development cycles and less model-specific tuning.

For the broader semiconductor industry, a nine-month design-to-tape-out cycle for an ASIC in this performance class - if it holds up under independent scrutiny - would be the first production-scale evidence for a thesis that has been building quietly: that AI-assisted chip design can meaningfully compress development timelines. It is one data point, self-reported. But it is a data point no one had before Wednesday.

Where it fits in a larger structure

Consider what OpenAI has assembled in the past 18 months. In January 2025, it announced the Stargate Project, a $500 billion infrastructure commitment funded alongside SoftBank, Oracle, and MGX, with Microsoft as a technology partner, targeting a network of data centers across the United States.[3] In May 2025, it acquired io Products, the hardware company co-founded by former Apple design chief Jony Ive, in an all-stock transaction valued at approximately $6.5 billion, bringing in a team built to develop the company's first consumer device - confirmed by OpenAI's chief global affairs officer to be on track for the second half of 2026.[4] In October 2025, it signed a multi-year chip supply agreement with AMD covering up to 6 gigawatts of Instinct GPUs, and in January 2026 closed a deal with Cerebras worth over $10 billion for 750 megawatts of wafer-scale inference capacity.[6] And now Jalapeño, with a deployment roadmap running from small prototypes in late 2026 to a multi-generation infrastructure program at gigawatt scale.[2]

The pattern is unmistakable. OpenAI is building vertically, layer by layer, in the same direction Apple once traveled: from software, to platform, to silicon, to device. The end state the company appears to be targeting is one in which it controls the model, the chip the model runs on, the data center the chip lives in, and the device in the consumer's hand.

The strategic upside of that structure is plain: a company that controls its own compute is insulated from the supply constraints and pricing leverage that define the current GPU market. Brockman's public framing centers on cost and access - "By designing more of the stack ourselves, we can serve more intelligence with greater efficiency and keep pushing advanced AI toward broader access"[2] - and that is a genuine part of the motivation. If the early performance-per-watt claims hold, custom silicon at gigawatt scale would materially reduce inference costs per token, which expands the economics of deploying AI in products that require high volume and low latency. The supply-chain independence is the structural bet underneath it.

What this means for Nvidia

The competitive framing here requires precision. Jalapeño is an inference chip. It does not, at least in its current form, challenge Nvidia's dominance in the training market, where Blackwell-generation accelerators remain the standard for frontier model development. Nvidia itself has signaled that it sees inference as its next major growth vector: at GTC 2026 in March, CEO Jensen Huang projected at least $1 trillion in revenue from its newest AI chips through 2027, with inference increasingly at the center of that thesis.[5]

That is precisely where the pressure lands. Inference is where OpenAI's costs concentrate - every ChatGPT query, every API call, every agentic workflow runs on inference compute. A custom ASIC that delivers substantially better performance per watt for those workloads represents real, recurring substitution of Nvidia revenue at scale. OpenAI is not alone in building custom inference silicon: among the hyperscalers, Google has its TPUs, Amazon has Trainium and Inferentia, and Microsoft announced its second-generation Maia 200 inference accelerator in January 2026. But OpenAI's arrival in that cohort is notable precisely because the company lacks Google's and Amazon's legacy infrastructure divisions. It has built this capability from scratch, under commercial pressure, in under two years.

Hock Tan's comment to CNBC is worth holding onto: demand from Broadcom's six ASIC customers is "simply insatiable," and he sees "even elevated demand in '28."[1] That is Broadcom's business growing regardless of who is buying Nvidia. The more interesting read is what it says about the scale of OpenAI's ambitions: a multi-gigawatt compute program is not a hedge against GPU scarcity. It is the infrastructure plan of a company that expects to be running a significant fraction of the world's AI workloads.

The risk in the reading

The full-stack thesis is compelling in structure, but several of its pieces remain unproven. Jalapeño's performance claims rest on early lab testing; the technical report has not been published. A chip that runs efficiently on GPT-5.3-Codex-Spark in controlled conditions may behave differently across the full range of OpenAI's production workloads, and the transition from small prototype deployment to gigawatt-scale production in 18 months is an execution risk that Tan himself flagged by breaking the timeline into stages. Stargate's financing and permitting have faced documented friction, and the io consumer device remains in development without a confirmed specification. OpenAI is assembling a stack that no pure-software AI lab has attempted before - which means there is no established playbook for what comes next, and no prior example to validate the timeline.

What is not in doubt is the intent. Three years ago, OpenAI needed Nvidia to build ChatGPT. By 2028, if the roadmap holds, it will run ChatGPT on its own chips, in its own data centers, on a device it designed itself. Whether that constitutes a genuine transformation or an ambitious overreach will depend on execution at each layer. But the direction - and the velocity - are no longer ambiguous.


Sources

  1. CNBC, June 24, 2026: "OpenAI and Broadcom reveal Jalapeño, first AI chip in partnership" - Brockman: "cannot get compute fast enough"; Tan on "insatiable" demand through 2028; nine-month design cycle attributed to OpenAI models

  2. Broadcom / OpenAI joint press release, June 24, 2026: Jalapeño specifications, architecture, nine-month tape-out, deployment roadmap, Brockman and Ho quotes, Celestica board/rack/system integration

  3. OpenAI: "Announcing The Stargate Project," January 21, 2025 - $500 billion infrastructure commitment; SoftBank, OpenAI, Oracle, and MGX as initial equity funders; Microsoft as a technology partner

  4. OpenAI: "A letter from Sam & Jony," May 2025 - io Products acquisition, Jony Ive assuming design responsibilities, consumer hardware ambitions

  5. Reuters, March 16, 2026: "Nvidia bets on AI inference as chip revenue opportunity hits $1 trillion" - Jensen Huang at GTC 2026 projecting at least $1 trillion in chip revenue through 2027

  6. CNBC, October 6, 2025: OpenAI-AMD multi-year chip supply deal, up to 6 gigawatts of Instinct GPUs; and CNBC, January 14, 2026: Cerebras-OpenAI deal worth over $10 billion for 750 megawatts of compute