DeepSeek ships V4, an open-weight model that bypasses Nvidia for Huawei silicon

The 1-million-token model matches the performance of GPT-5.4 and Claude Opus 4.6 while natively optimizing for China's domestic Ascend 950 supernodes.

By A. Hollis Verne · filed from Beijing

DeepSeek has released V4, an open-weight frontier model that matches the performance of the latest Western closed systems while explicitly bypassing Nvidia's hardware ecosystem. The release, split into Pro and Flash variants, establishes that US export controls designed to cap Chinese artificial intelligence capabilities have instead accelerated the development of a fully domestic, highly optimized alternative stack.

The model's efficiency gains stem from a fundamental rewrite of its attention mechanism. Treating all previous text equally is the primary bottleneck for long prompts, so V4 instead compresses older information while retaining high-fidelity recall for immediate, nearby tokens. This selective attention allows both V4 variants to process a 1-million-token context window. V4-Pro uses only 27 percent of the compute and 10 percent of the memory required by its predecessor, V3.2; for the smaller V4-Flash, memory use drops to 7 percent.
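The general idea of keeping nearby tokens at full fidelity while compressing older ones can be illustrated with a toy sketch. This is not DeepSeek's mechanism, which the article does not describe in detail: the function names, the `window` and `block` parameters, and the use of mean pooling as the compression step are all assumptions made for illustration.

```python
from typing import List

Vector = List[float]


def mean_pool(block: List[Vector]) -> Vector:
    """Average a non-empty block of equal-length vectors into one summary vector."""
    n = len(block)
    return [sum(column) / n for column in zip(*block)]


def compress_cache(cache: List[Vector], window: int, block: int) -> List[Vector]:
    """Hypothetical cache compression, for illustration only.

    The most recent `window` entries are kept unchanged. Older entries are
    replaced by one mean-pooled summary per group of `block` entries.
    """
    if window < 0 or block < 1:
        raise ValueError("window must be >= 0 and block must be >= 1")
    split = max(len(cache) - window, 0)
    older, recent = cache[:split], cache[split:]
    summaries = [mean_pool(older[i:i + block]) for i in range(0, len(older), block)]
    return summaries + recent


# Ten toy "token" vectors; keep the last 4 intact, pool the first 6 in groups of 3.
cache = [[float(i), 1.0] for i in range(10)]
compressed = compress_cache(cache, window=4, block=3)
print(len(cache), "->", len(compressed))  # prints "10 -> 6"
```

In this toy version the stored entries shrink roughly by a factor of `block` for everything outside the recent window, which is the kind of trade-off behind the memory savings described above, though the real figures depend on design details not covered here.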

This architectural efficiency translates directly into aggressive commercial pricing. DeepSeek is charging $1.74 per million input tokens and $3.48 per million output tokens for V4-Pro. The V4-Flash variant drops to $0.14 and $0.28, respectively. Despite the low prices, V4-Pro matches Anthropic's Claude Opus 4.6, OpenAI's GPT-5.4, and Google's Gemini 3.1 on standard benchmarks. Crucially, DeepSeek withheld pre-release access from American chipmakers like Nvidia and AMD, choosing instead to optimize the model natively for Huawei's Ascend 950 series supernodes.
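To make the quoted per-million-token prices concrete, the arithmetic for a single request can be sketched as follows. The prices are the ones stated in this article; the `PRICES` table, the `request_cost` helper, and the example token counts are hypothetical and not part of any DeepSeek API.

```python
from decimal import Decimal

# USD per million tokens, as quoted in the article.
PRICES = {
    "V4-Pro": {"input": Decimal("1.74"), "output": Decimal("3.48")},
    "V4-Flash": {"input": Decimal("0.14"), "output": Decimal("0.28")},
}

TOKENS_PER_MILLION = Decimal(1_000_000)


def request_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Return the USD cost of one request at the quoted list prices."""
    price = PRICES[model]
    total = price["input"] * input_tokens + price["output"] * output_tokens
    return total / TOKENS_PER_MILLION


# Hypothetical workload: a full 1-million-token prompt with a 10,000-token reply.
for model in PRICES:
    print(model, request_cost(model, 1_000_000, 10_000))
```

Under these assumptions, filling the entire context window once costs about $1.77 on V4-Pro and about $0.14 on V4-Flash.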

The immediate winners are developers building agentic workflows, who can now run extensive codebase analysis and multi-step reasoning tasks without hitting the prohibitive cost ceilings of Western APIs. Huawei and the broader Chinese silicon sector also win, having secured a flagship software asset that proves their domestic hardware can support state-of-the-art training and inference. The losers are the major Western frontier labs whose revenue models rely on sustained API margins, and the architects of the chip embargoes who assumed hardware constraints would enforce a permanent capability gap.

V4 forecloses the assumption that open-weight models must inherently lag a generation behind proprietary systems, or that cutting-edge AI requires an uninterrupted supply of American GPUs. What the release opens is a bifurcated global artificial intelligence architecture: one where the underlying hardware substrate diverges sharply across the Pacific, even as the resulting cognitive capabilities achieve strict parity.

Sources (1)

https://www.technologyreview.com/2026/04/24/1136422/why-deepseeks-v4-matters/

filed by A. Hollis Verne · drawn from 1 source · April 24, 2026
