Technical Specifications

The NVIDIA H200 sets a new standard for AI computing. Below are the detailed specifications and a comparison with the previous-generation H100.

Feature              | NVIDIA H200    | NVIDIA H100
---------------------|----------------|---------------
Memory Capacity      | 141 GB HBM3e   | 80 GB HBM3
Memory Bandwidth     | 4.8 TB/s       | 3.35 TB/s
Architecture         | NVIDIA Hopper™ | NVIDIA Hopper™
Llama2 70B Inference | 1.9x Faster    | Baseline
GPT-3 175B Inference | 1.6x Faster    | Baseline

Key Features

HBM3e Memory

The H200 is the first GPU to feature HBM3e memory, providing 141 GB of capacity, nearly 1.8x the H100's 80 GB. Larger models can fit entirely in a single GPU's memory, reducing the need for model parallelism and the communication overhead it brings.
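
As a rough illustration, the back-of-the-envelope calculation below shows which model sizes fit at common precisions. It is a sketch only: it counts weight memory alone and ignores the KV cache, activations, and runtime overhead.

```python
# Sketch: weight memory for a dense LLM at common precisions, compared
# against H100 (80 GB) and H200 (141 GB) capacity. Weights only; the
# KV cache and activations need additional headroom in practice.

def weight_gb(n_params: float, bytes_per_param: float) -> float:
    """Weight memory in GB (1 GB = 1e9 bytes)."""
    return n_params * bytes_per_param / 1e9

for name, n_params in [("Llama2 70B", 70e9), ("GPT-3 175B", 175e9)]:
    for prec, nbytes in [("FP16/BF16", 2), ("FP8", 1)]:
        gb = weight_gb(n_params, nbytes)
        print(f"{name} @ {prec}: {gb:.0f} GB  "
              f"(fits H100: {gb < 80}, fits H200: {gb < 141})")
```

At FP16, a 70B-parameter model needs about 140 GB for weights alone: just under the H200's 141 GB, but far beyond the H100's 80 GB.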

4.8 TB/s Bandwidth

With 4.8 TB/s of memory bandwidth, roughly 1.4x the H100's 3.35 TB/s, the H200 feeds data to its computational cores fast enough to significantly accelerate memory-bound workloads like LLM inference, where generating each token requires streaming the model's weights from memory.
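
A simple roofline-style estimate makes the bandwidth dependence concrete. This is a sketch under the assumption that small-batch decode reads every weight once per token; real throughput also depends on batch size, KV cache traffic, and kernel efficiency.

```python
# Sketch: bandwidth-bound upper limit on single-batch decode throughput.
# If each generated token streams all weights from HBM once, then
# tokens/s <= bandwidth / weight_bytes.

def max_tokens_per_s(bandwidth_tb_s: float, n_params: float,
                     bytes_per_param: float) -> float:
    weight_bytes = n_params * bytes_per_param
    return bandwidth_tb_s * 1e12 / weight_bytes

for gpu, bw in [("H100", 3.35), ("H200", 4.8)]:
    t = max_tokens_per_s(bw, 70e9, 2)  # Llama2 70B at FP16
    print(f"{gpu}: <= {t:.0f} tokens/s (batch 1, 70B FP16)")
```

Note that the raw bandwidth ratio (about 1.43x) is smaller than the 1.9x Llama2 70B speedup in the table above; the larger memory capacity presumably contributes as well, for example by allowing bigger batches.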

Hopper Architecture

Built on the NVIDIA Hopper architecture, the H200 features the Transformer Engine, which dynamically manages precision across FP8, FP16, and BF16 on a per-layer basis to maximize throughput while preserving model accuracy.
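
For a sense of what this looks like in practice, below is a minimal sketch using NVIDIA's Transformer Engine PyTorch bindings to run a layer under FP8 autocast. The calls mirror the library's documented quickstart, but exact arguments vary across versions, so treat it as illustrative.

```python
# Minimal sketch: an FP8 forward pass with NVIDIA Transformer Engine.
# Mirrors the library's quickstart; check your installed version's docs.
import torch
import transformer_engine.pytorch as te
from transformer_engine.common import recipe

# HYBRID recipe: E4M3 for forward weights/activations, E5M2 for gradients.
fp8_recipe = recipe.DelayedScaling(fp8_format=recipe.Format.HYBRID)

layer = te.Linear(4096, 4096, bias=True).cuda()
x = torch.randn(8, 4096, device="cuda", dtype=torch.bfloat16)

with te.fp8_autocast(enabled=True, fp8_recipe=fp8_recipe):
    y = layer(x)  # matmul runs in FP8 where the Transformer Engine allows
```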

NVLink Switch System

The H200 supports fourth-generation NVLink, providing up to 900 GB/s of GPU-to-GPU bandwidth, roughly 7x that of PCIe Gen5. Combined with the NVLink Switch System, this high-speed interconnect is critical for scaling training and inference across large multi-GPU clusters.
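
To see the interconnect in action, here is a hypothetical micro-benchmark (the script name, tensor size, and iteration counts are illustrative assumptions) that times an all-reduce with NCCL, which routes intra-node GPU traffic over NVLink when available.

```python
# Sketch: timing an all-reduce across GPUs via NCCL. Launch with e.g.:
#   torchrun --nproc_per_node=8 allreduce_bench.py
import time
import torch
import torch.distributed as dist

dist.init_process_group(backend="nccl")
rank = dist.get_rank()
torch.cuda.set_device(rank)

# 256 MB of FP32 data per GPU (64M elements x 4 bytes).
x = torch.randn(64 * 1024 * 1024, device="cuda")

# Warm up so NCCL can establish communicators before timing.
for _ in range(5):
    dist.all_reduce(x)
torch.cuda.synchronize()

iters = 20
start = time.perf_counter()
for _ in range(iters):
    dist.all_reduce(x)
torch.cuda.synchronize()
elapsed = (time.perf_counter() - start) / iters

if rank == 0:
    print(f"all-reduce of 256 MB: {elapsed * 1e3:.2f} ms/iter")

dist.destroy_process_group()
```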