"Extreme Co-Design — Philosophy"
01 — Extreme Co-Design
“In the last 10 years, Moore’s Law would have progressed computing about 100 times. We progressed and scaled up computing by a million times. And so we’re going to keep on doing that — through extreme co-design.” — Jensen Huang, Lex Fridman Podcast #494, 2025
What It Is
Extreme Co-Design is the philosophy of designing every layer of a system simultaneously, treating the system as one optimization problem rather than a sequence of independent layers.
The word extreme is deliberate. It signals integration far beyond typical “co-design” (coordinating a chip team with a firmware team). For Nvidia, extreme co-design means the GPU die, CPU, memory subsystem, NVLink interconnect fabric, networking ASICs, power delivery, cooling, rack mechanical design, operating software stack, inference runtime, and even the specific AI algorithms are all treated as a single engineering problem co-optimized together.
The Quantitative Evidence
- Moore’s Law alone: ~100x compute improvement over 10 years
- Nvidia’s actual delivery: ~1,000,000x over the same period
- The 10,000x delta comes entirely from co-design, not from better components
This is the primary argument for why co-design is necessary, not merely good practice. Per-component gains are linear. Cross-surface gains are multiplicative.
Core Principles
1. Full-Stack Simultaneous Optimization
All layers are designed in the same engineering cycle. Cross-layer dependencies are surfaced and resolved before commitment, not after.
2. The Rack Is the Unit of Compute
Not the chip. Not the server. The rack. 72 GPUs connected via NVLink with no GPU-to-GPU direct cable within a server — all equidistant through the Switch fabric. The system is one distributed accelerator. You cannot use a component independently — it is designed to be a node in an integrated system.
Translated: The domain is the unit of delivery, not the service.
3. Moore’s Law Substitution
When per-component gains plateau, the only path to continued exponential improvement is co-optimizing how components interact. The shared surface becomes the primary site of engineering investment.
4. Algorithmic Co-Design
The methodology extends to the consumers of the system. Mixture of Experts models are a co-designed outcome — hardware and software teams worked with model designers to ensure MoE routing maps efficiently onto NVLink-connected multi-GPU domains. The algorithm is co-designed with the infrastructure, not adapted to it after the fact.
5. Constraints as First-Class Design Inputs
Every component is co-designed to operate within system-level constraints (thermal envelope, power delivery, bandwidth). Constraints are not afterthoughts; they are traded against capability from day one.
6. Kernel-Surface Co-Optimization
At the execution layer, runtime code and hardware capabilities are developed in direct coordination — not after the substrate ships. The compiler toolchain is aware of hardware capabilities before commitment.
Product Evidence
GB200 NVL72 (Blackwell)
36 Grace CPUs + 72 Blackwell GPUs in a liquid-cooled rack, 130 TB/s aggregate NVLink5 bandwidth.
- Delivers 30x faster trillion-parameter LLM inference vs. prior generation
- System has only ~2x the transistors
- The 15x delta is entirely from co-design
Vera Rubin Platform
Six co-designed chips. Despite doubling NVLink bandwidth, cable and connector count did not change — a mechanical co-design achievement.
NVLink Fusion
Extends the co-design surface to third-party components. The fabric is the shared surface, and third parties join it by accepting the co-design constraints. The point: the surface is not proprietary — it is the coordination mechanism.
Sarvam AI Inference (2026)
Most granular published example:
- Kernel + scheduling optimizations → 2x
- Quantization scheme co-designed with hardware → additional 2x
- Total: 4x — explicitly described as “extreme hardware-software co-design”
The Management Dimension
Jensen Huang holds no one-on-one meetings. All sessions are cross-functional groups where problems are solved in the presence of everyone affected by the decision. This mirrors the engineering philosophy organizationally: a surface decision cannot be made by one side alone.
Summary
Extreme Co-Design is not primarily a hardware concept. It is a design philosophy about the relationship between layers. The insight: performance gains are not additive across independently optimized layers — they are multiplicative when layers are co-optimized across their shared surfaces.
The hardware is the domain where Nvidia proved it at scale. The principle generalizes to any system where components interact through defined surfaces.