OpenAI ships GPT-5.5 as Nvidia deploys the model to 10,000 employees on GB200 architecture
The rollout of the Codex agent inside Nvidia reveals the inference economics of the new rack-scale systems, driving a 35x reduction in per-token costs.
OpenAI's release of GPT-5.5 arrives not as a standalone API announcement, but embedded within a massive enterprise deployment at its primary hardware partner. Nvidia has rolled out the model via OpenAI's Codex agent to 10,000 employees across its engineering and operations divisions. The deployment serves as a live demonstration of the GB200 NVL72 rack-scale architecture, proving that continuous, agentic inferenceThe process of running live data through a trained artificial intelligence model to generate an output or prediction. It is the operational phase that follows a model's initial training. at frontier scale has crossed the threshold of enterprise viability.
The structural shift is the coupling of agentic softwareSoftware designed to pursue open-ended goals by planning intermediate steps and executing them autonomously, rather than following a rigid set of pre-programmed rules. with next-generation silicon economics. To make autonomous coding agents practical, the underlying compute must support continuous, multi-step inferenceThe process of running live data through a trained artificial intelligence model to generate an output or prediction. It is the operational phase that follows a model's initial training. without breaking corporate budgets or power constraints. By serving GPT-5.5 on the GB200 NVL72, Nvidia claims a 35x lower cost per million tokens and a 50x increase in token output per second per megawatt compared to prior-generation systems. This efficiency curve allows agents to run continuously in secure, dedicated cloud virtual machines for individual developers, reducing multi-day debugging cycles to hours.
The scale of the underlying infrastructure points to a deepening codependence between the two companies. The GPT-5.5 model itself is the product of the first joint bring-up of a 100,000-GPU GB200 cluster, which completed its training runs with system-level reliability that sets a new baseline for frontier scale. Looking forward, OpenAI has committed to deploying more than 10 gigawatts of Nvidia systems for its next-generation infrastructure—a capital expenditure that locks the model builder into Nvidia's hardware roadmap for the remainder of the decade.
The immediate winners are enterprise IT departments, which gain a tested blueprint for deploying agentic workflows securely. Nvidia's internal implementation uses a zero-data retentionA security policy where a service provider processes a user's input to generate a response but immediately deletes the data, ensuring it is never stored or used for future model training. policy and restricts the Codex agent to approved cloud sandboxes via Secure Shell connections, granting read-only permissions through command-line interfaces. The losers are competing hardware accelerators attempting to break Nvidia's grip on the inferenceThe process of running live data through a trained artificial intelligence model to generate an output or prediction. It is the operational phase that follows a model's initial training. market; the tight co-design loop between OpenAI's frontier models and Nvidia's ecosystem frameworks creates a software moat that raw compute benchmarks cannot easily cross.
What this deployment forecloses is the assumption that frontier models remain too computationally expensive for persistent, agentic enterprise use. What it opens is a new baseline for corporate infrastructure, where every employee is allocated a dedicated compute sandbox running a continuous model. The bottleneck for enterprise AI adoption now shifts entirely from the raw cost of inferenceThe process of running live data through a trained artificial intelligence model to generate an output or prediction. It is the operational phase that follows a model's initial training. to the ability of organizations to securely map their internal data to agent-accessible interfaces.
