1mo ago·San Francisco·2 min read

OpenAI ships GPT-5.5 as Nvidia deploys the model to 10,000 employees on GB200 architecture

The rollout of the Codex agent inside Nvidia reveals the inference economics of the new rack-scale systems, driving a 35x reduction in per-token costs.

By A. Hollis Verne · filed from San Francisco

OpenAI's release of GPT-5.5 arrives not as a standalone API announcement, but embedded within a massive enterprise deployment at its primary hardware partner. Nvidia has rolled out the model via OpenAI's Codex agent to 10,000 employees across its engineering and operations divisions. The deployment serves as a live demonstration of the GB200 NVL72 rack-scale architecture, proving that continuous, agentic inference at frontier scale has crossed the threshold of enterprise viability.

The structural shift is the coupling of agentic software with next-generation silicon economics. To make autonomous coding agents practical, the underlying compute must support continuous, multi-step inference without breaking corporate budgets or power constraints. By serving GPT-5.5 on the GB200 NVL72, Nvidia claims a 35x lower cost per million tokens and a 50x increase in token output per second per megawatt compared to prior-generation systems. This efficiency curve allows agents to run continuously in secure, dedicated cloud virtual machines for individual developers, reducing multi-day debugging cycles to hours.

The scale of the underlying infrastructure points to a deepening codependence between the two companies. The GPT-5.5 model itself is the product of the first joint bring-up of a 100,000-GPU GB200 cluster, which completed its training runs with system-level reliability that sets a new baseline for frontier scale. Looking forward, OpenAI has committed to deploying more than 10 gigawatts of Nvidia systems for its next-generation infrastructure—a capital expenditure that locks the model builder into Nvidia's hardware roadmap for the remainder of the decade.

The immediate winners are enterprise IT departments, which gain a tested blueprint for deploying agentic workflows securely. Nvidia's internal implementation uses a zero-data retention policy and restricts the Codex agent to approved cloud sandboxes via Secure Shell connections, granting read-only permissions through command-line interfaces. The losers are competing hardware accelerators attempting to break Nvidia's grip on the inference market; the tight co-design loop between OpenAI's frontier models and Nvidia's ecosystem frameworks creates a software moat that raw compute benchmarks cannot easily cross.

What this deployment forecloses is the assumption that frontier models remain too computationally expensive for persistent, agentic enterprise use. What it opens is a new baseline for corporate infrastructure, where every employee is allocated a dedicated compute sandbox running a continuous model. The bottleneck for enterprise AI adoption now shifts entirely from the raw cost of inference to the ability of organizations to securely map their internal data to agent-accessible interfaces.

Sources (1)

https://blogs.nvidia.com/blog/openai-codex-gpt-5-5-ai-agents/

filed by A. Hollis Verne · drawn from 1 source · April 23, 2026

Calibrate this dispatchtotal · 0 / 25

Drag along each spoke — center is 0, edge is 5