The NVIDIA HGX™ platform brings together the full power of NVIDIA GPUs, NVIDIA NVLink™, NVIDIA networking, and fully optimized AI and high-performance computing (HPC) software stacks to provide the highest application performance and drive the fastest time to insights for every data center.
The NVIDIA HGX B300 integrates eight NVIDIA Blackwell Ultra GPUs with high-speed interconnects, delivering 1.5x more dense FP4 Tensor Core FLOPS and 2x attention performance versus HGX B200 to propel the data center into a new era of accelerated computing and generative AI. As a premier accelerated scale-up platform with up to 30x more AI Factory output than the previous generation, NVIDIA Blackwell Ultra-based HGX systems are designed for the most demanding generative AI, data analytics, and HPC workloads.
Figure note: DeepSeek-R1, input sequence length (ISL) = 32K, output sequence length (OSL) = 8K. HGX B300 with FP4 and Dynamo disaggregated serving; H100 with FP8 in-flight batching. Projected performance, subject to change.
The frontier curve illustrates the key parameters that determine the token revenue output of an AI factory. The vertical axis represents total GPU throughput, in tokens per second (TPS), for a one-megawatt (1 MW) AI factory, while the horizontal axis quantifies user interactivity and responsiveness as TPS for a single user. At the optimal intersection of throughput and responsiveness, HGX B300 delivers a 30x overall increase in AI factory output compared to the NVIDIA Hopper architecture for maximum token revenue.
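The throughput-versus-interactivity trade-off above can be sketched numerically. The curve shape, peak throughput, price, and responsiveness floor below are all hypothetical placeholders for illustration, not NVIDIA specifications:

```python
# Toy model of the frontier curve: total factory throughput falls as
# per-user TPS rises, because serving each user faster means smaller
# batches and lower aggregate utilization. All constants are assumptions.

def factory_throughput(per_user_tps: float, peak_tps_per_mw: float = 1.0e6,
                       knee: float = 50.0) -> float:
    """Hypothetical frontier: tokens/s per MW as a function of per-user TPS."""
    return peak_tps_per_mw * knee / (knee + per_user_tps)

def token_revenue(per_user_tps: float, price_per_mtok: float = 2.0) -> float:
    """Revenue per second per MW at a flat price per million tokens."""
    return factory_throughput(per_user_tps) / 1e6 * price_per_mtok

# Sweep interactivity targets and keep the revenue-maximizing point that
# still meets a responsiveness floor (here, >= 30 TPS per user).
candidates = list(range(10, 201, 10))
feasible = [tps for tps in candidates if tps >= 30]
best = max(feasible, key=token_revenue)
print(best, round(token_revenue(best), 3))
```

Because this toy frontier is monotonically decreasing, the optimum sits exactly at the responsiveness floor; a real deployment would measure the curve empirically per model and serving stack.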
Figure note: projected performance, subject to change. Performance per GPU, FP8, 16K batch size, 16K sequence length.
The HGX B300 platform delivers up to 2.6x higher training performance for large language models such as DeepSeek-R1. With over 2 TB of high-speed memory and 14.4 TB/s of NVLink Switch bandwidth, it enables massive-scale model training and high-throughput inter-GPU communication.
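A back-of-envelope calculation shows why inter-GPU bandwidth matters at training scale: a ring all-reduce of gradients moves roughly 2(N−1)/N bytes per byte of gradient per GPU. The model size below is a hypothetical example; the 1.8 TB/s link figure comes from the spec table later in this document:

```python
# Idealized all-reduce time on an 8-GPU NVLink domain (ignores latency,
# overlap, and protocol overhead). Model size is an assumed example.

N = 8                      # GPUs on the baseboard
grad_bytes = 70e9 * 2      # e.g., 70B parameters in BF16 (assumption)
link_bw = 1.8e12           # NVLink GPU-to-GPU bandwidth, bytes/s

traffic = 2 * (N - 1) / N * grad_bytes   # bytes each GPU sends in a ring all-reduce
t = traffic / link_bw                    # idealized transfer time, seconds
print(round(t, 3))
```

Even under these idealized assumptions, a full-gradient exchange takes on the order of a tenth of a second, which is why high NVLink bandwidth is essential to keep communication overlapped with compute.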
The data center is the new unit of computing, and networking plays an integral role in scaling application performance across it. Paired with NVIDIA Quantum InfiniBand, HGX delivers world-class performance and efficiency, ensuring full utilization of computing resources.
For AI cloud data centers that deploy Ethernet, HGX is best used with the NVIDIA Spectrum-X networking platform, which powers the highest AI performance over Ethernet. It features Spectrum-X switches and NVIDIA BlueField-3 SuperNICs for optimal resource utilization and performance isolation, delivering consistent, predictable outcomes for thousands of simultaneous AI jobs at every scale. Spectrum-X also enables advanced cloud multi-tenancy and zero-trust security. As a reference design, NVIDIA built Israel-1, a hyperscale generative AI supercomputer composed of Dell PowerEdge XE9680 servers based on the NVIDIA HGX 8-GPU platform, BlueField-3 SuperNICs, and Spectrum-4 switches.
NVIDIA HGX is available as a single baseboard with four or eight NVIDIA Hopper SXM modules, or with eight NVIDIA Blackwell or NVIDIA Blackwell Ultra SXM modules. These powerful combinations of hardware and software lay the foundation for unprecedented AI supercomputing performance.
| | HGX B300 | HGX B200 |
|---|---|---|
| Form Factor | 8x NVIDIA Blackwell Ultra SXM | 8x NVIDIA Blackwell SXM |
| FP4 Tensor Core¹ | 144 PFLOPS / 108 PFLOPS | 144 PFLOPS / 72 PFLOPS |
| FP8/FP6 Tensor Core² | 72 PFLOPS | 72 PFLOPS |
| INT8 Tensor Core² | 3 POPS | 72 POPS |
| FP16/BF16 Tensor Core² | 36 PFLOPS | 36 PFLOPS |
| TF32 Tensor Core² | 18 PFLOPS | 18 PFLOPS |
| FP32 | 600 TFLOPS | 600 TFLOPS |
| FP64/FP64 Tensor Core | 10 TFLOPS | 296 TFLOPS |
| Total Memory | 2.1 TB | 1.4 TB |
| NVIDIA NVLink | Fifth generation | Fifth generation |
| NVIDIA NVLink Switch™ | NVLink 5 Switch | NVLink 5 Switch |
| NVLink GPU-to-GPU Bandwidth | 1.8 TB/s | 1.8 TB/s |
| Total NVLink Bandwidth | 14.4 TB/s | 14.4 TB/s |
| Networking Bandwidth | 1.6 TB/s | 0.8 TB/s |
| Attention Performance³ | 2x | 1x |
1. Specification shown as sparse / dense.
2. Specification with sparsity; the dense figure is half the sparse figure shown.
3. Relative to NVIDIA Blackwell (HGX B200).
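Footnote 2 means the dense Tensor Core figures can be derived from the table directly. A minimal sketch, assuming the 2:1 structured-sparsity ratio the footnote states:

```python
# Derive dense Tensor Core throughput from the sparse specs listed in the
# HGX B300/B200 table above (dense = half of sparse, per footnote 2).

SPARSE_PFLOPS = {"FP8/FP6": 72, "FP16/BF16": 36, "TF32": 18}

dense = {fmt: sparse / 2 for fmt, sparse in SPARSE_PFLOPS.items()}
print(dense)  # {'FP8/FP6': 36.0, 'FP16/BF16': 18.0, 'TF32': 9.0}
```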
HGX H200

| | 4-GPU | 8-GPU |
|---|---|---|
| Form Factor | 4x NVIDIA H200 SXM | 8x NVIDIA H200 SXM |
| FP8 Tensor Core* | 16 PFLOPS | 32 PFLOPS |
| INT8 Tensor Core* | 16 POPS | 32 POPS |
| FP16/BF16 Tensor Core* | 8 PFLOPS | 16 PFLOPS |
| TF32 Tensor Core* | 4 PFLOPS | 8 PFLOPS |
| FP32 | 270 TFLOPS | 540 TFLOPS |
| FP64 | 140 TFLOPS | 270 TFLOPS |
| FP64 Tensor Core | 270 TFLOPS | 540 TFLOPS |
| Total Memory | 564 GB HBM3E | 1.1 TB HBM3E |
| GPU Aggregate Bandwidth | 19 TB/s | 38 TB/s |
| NVLink | Fourth generation | Fourth generation |
| NVSwitch | N/A | NVLink 4 Switch |
| NVSwitch GPU-to-GPU Bandwidth | N/A | 900 GB/s |
| Total Aggregate Bandwidth | 3.6 TB/s | 7.2 TB/s |
| Networking Bandwidth | 0.4 TB/s | 0.8 TB/s |
HGX H100

| | 4-GPU | 8-GPU |
|---|---|---|
| Form Factor | 4x NVIDIA H100 SXM | 8x NVIDIA H100 SXM |
| FP8 Tensor Core* | 16 PFLOPS | 32 PFLOPS |
| INT8 Tensor Core* | 16 POPS | 32 POPS |
| FP16/BF16 Tensor Core* | 8 PFLOPS | 16 PFLOPS |
| TF32 Tensor Core* | 4 PFLOPS | 8 PFLOPS |
| FP32 | 270 TFLOPS | 540 TFLOPS |
| FP64 | 140 TFLOPS | 270 TFLOPS |
| FP64 Tensor Core | 270 TFLOPS | 540 TFLOPS |
| Total Memory | 320 GB HBM3 | 640 GB HBM3 |
| GPU Aggregate Bandwidth | 13 TB/s | 27 TB/s |
| NVLink | Fourth generation | Fourth generation |
| NVSwitch | N/A | NVLink 4 Switch |
| NVSwitch GPU-to-GPU Bandwidth | N/A | 900 GB/s |
| Total Aggregate Bandwidth | 3.6 TB/s | 7.2 TB/s |
| Networking Bandwidth | 0.4 TB/s | 0.8 TB/s |
* With sparsity
Learn more about the NVIDIA Blackwell architecture.