## Technical Specifications
The NVIDIA H200 sets a new standard for AI computing. Below are the detailed specifications and a comparison with the previous-generation H100.
| Feature | NVIDIA H200 | NVIDIA H100 |
|---|---|---|
| Memory Capacity | 141 GB HBM3e | 80 GB HBM3 |
| Memory Bandwidth | 4.8 TB/s | 3.35 TB/s |
| Architecture | NVIDIA Hopper™ | NVIDIA Hopper™ |
| Llama2 70B Inference | Up to 1.9x faster | Baseline |
| GPT-3 175B Inference | Up to 1.6x faster | Baseline |
## Key Features

### HBM3e Memory
The H200 is the first GPU to feature HBM3e memory, providing 141 GB of capacity. This allows larger models to fit in the memory of a single GPU, reducing the need for model parallelism and the communication overhead that comes with it.
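As a back-of-the-envelope illustration (not an NVIDIA tool), the sketch below checks whether a model's weights fit in a single GPU's memory at a given precision. The function name and figures are illustrative assumptions, and the estimate covers weights only, so real deployments need extra headroom for the KV cache and activations.

```python
# Back-of-the-envelope check (illustrative, not an NVIDIA tool): do a model's
# weights fit in a single GPU's memory at a given precision?

GPU_MEMORY_GB = {"H100": 80, "H200": 141}

def weights_footprint_gb(params_billions: float, bytes_per_param: float) -> float:
    """Approximate weight memory in GB. Ignores the KV cache and activations,
    which need additional headroom in any real deployment."""
    return params_billions * 1e9 * bytes_per_param / 1e9

fp16 = weights_footprint_gb(70, 2)  # Llama2 70B at FP16: ~140 GB of weights
for gpu, cap in GPU_MEMORY_GB.items():
    print(f"Llama2 70B @ FP16 ({fp16:.0f} GB) fits on {gpu} ({cap} GB): {fp16 <= cap}")
```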
### 4.8 TB/s Bandwidth
With 4.8 TB/s of memory bandwidth, roughly 1.4x the H100's 3.35 TB/s, the H200 can feed data to its compute cores fast enough to significantly accelerate memory-bound workloads like LLM inference.
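To see why bandwidth dominates here, consider a simple roofline-style argument: in autoregressive decoding, every weight is read roughly once per generated token, so per-token latency cannot beat weight bytes divided by memory bandwidth. The sketch below is a hypothetical illustration under that assumption (batch size 1, weights-dominated traffic), not a measured benchmark.

```python
# Roofline-style lower bound (illustrative, not a benchmark): in autoregressive
# decoding, every weight is read roughly once per generated token, so token
# latency cannot beat (weight bytes) / (memory bandwidth).

def min_ms_per_token(params_billions: float, bytes_per_param: float,
                     bandwidth_tb_s: float) -> float:
    weight_bytes = params_billions * 1e9 * bytes_per_param
    return weight_bytes / (bandwidth_tb_s * 1e12) * 1e3

# 70B parameters at FP8 (1 byte/param), batch size 1:
for gpu, bw in [("H100", 3.35), ("H200", 4.8)]:
    ms = min_ms_per_token(70, 1, bw)
    print(f"{gpu}: >= {ms:.1f} ms/token, <= {1000 / ms:.0f} tokens/s")
```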
### Hopper Architecture
Built on the NVIDIA Hopper architecture, the H200 features the Transformer Engine, which intelligently manages precision (FP8, FP16, BF16) to optimize performance and efficiency for AI models.
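As a minimal sketch of how this looks in code, assuming NVIDIA's open-source Transformer Engine package (transformer_engine) and an FP8-capable GPU: the example below runs a linear layer under FP8 autocasting, following the library's documented quickstart pattern. The layer dimensions and batch size here are arbitrary.

```python
import torch
import transformer_engine.pytorch as te
from transformer_engine.common import recipe

# FP8 recipe with delayed scaling; the HYBRID format uses E4M3 for forward
# tensors and E5M2 for gradients (all recipe arguments are optional).
fp8_recipe = recipe.DelayedScaling(margin=0, fp8_format=recipe.Format.HYBRID)

layer = te.Linear(4096, 4096, bias=True)   # TE module, placed on CUDA by default
x = torch.randn(512, 4096, device="cuda")  # dimensions here are arbitrary

# Inside the autocast context, supported ops run in FP8 with automatic scaling.
with te.fp8_autocast(enabled=True, fp8_recipe=fp8_recipe):
    y = layer(x)

y.sum().backward()  # the backward pass also uses FP8 where supported
```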
### NVLink Switch System
The H200 supports fourth-generation NVLink, providing up to 900 GB/s of GPU-to-GPU bandwidth, and the NVLink Switch System extends this high-speed communication across nodes. This is critical for scaling training and inference to thousands of GPUs.
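A minimal sketch of what this communication looks like in practice, assuming PyTorch with the NCCL backend (NCCL routes collectives over NVLink automatically when it is available); the script name and tensor sizes are illustrative.

```python
# Launch with: torchrun --nproc_per_node=<num_gpus> allreduce_demo.py
import os
import torch
import torch.distributed as dist

def main():
    dist.init_process_group(backend="nccl")  # NCCL picks NVLink paths when present
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # Every rank contributes a tensor filled with its own rank id; all_reduce
    # sums them in place, so all GPUs end up with the same result.
    t = torch.full((1024, 1024), float(dist.get_rank()), device="cuda")
    dist.all_reduce(t, op=dist.ReduceOp.SUM)

    if dist.get_rank() == 0:
        expected = sum(range(dist.get_world_size()))
        print(f"all_reduce result {t[0, 0].item():.0f}, expected {expected}")

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```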