
B200 vs 2× H100 NVL

Head-to-head for local LLM inference. The honest comparison: VRAM, bandwidth, compute, and which of the 30 catalog models actually fit on each.

The specs.

$ diff specs b200 h100-nvl-2x

| Stat | b200 | h100-nvl-2x | Δ vs b200 |
|---|---|---|---|
| VRAM | 192 GB | 188 GB | -2% |
| Memory bandwidth | 8,000 GB/s | 3,938 GB/s | -51% |
| FP16 compute | 2,250 TFLOPS | 1,979 TFLOPS | -12% |
| Weights budget at 8K ctx | 154 GB | 151 GB | -2% |
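
The delta column appears to measure the h100-nvl-2x against the b200 as the baseline, which is why the same bandwidth gap reads as -51% here but as +103% when the B200 is measured against the 2× H100 NVL. A minimal Python sketch of that arithmetic, using only the figures listed in the specs (the helper name `pct_diff` is made up for illustration, and the rounding rule is an assumption):

```python
def pct_diff(value: float, baseline: float) -> int:
    """Relative difference of value against baseline, as a rounded percentage."""
    return round((value - baseline) / baseline * 100)

# (b200, h100-nvl-2x) pairs copied from the specs above
specs = {
    "VRAM (GB)": (192, 188),
    "Memory bandwidth (GB/s)": (8000, 3938),
    "FP16 compute (TFLOPS)": (2250, 1979),
    "Weights budget at 8K ctx (GB)": (154, 151),
}

for name, (b200, h100x2) in specs.items():
    down = pct_diff(h100x2, b200)  # h100-nvl-2x relative to b200
    up = pct_diff(b200, h100x2)    # b200 relative to h100-nvl-2x
    print(f"{name}: {down:+d}% vs b200, {up:+d}% the other way")
```

The two directions are not mirror images because the baseline changes: losing 51% of 8,000 GB/s and gaining 103% on 3,938 GB/s describe the same gap.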

Model fit difference.

$ models that change with the card

Fits on both

27 of 30

Only on b200

0

Only on h100-nvl-2x

0

// showing 12 of 30 models; differing fits first

Model

b200

h100-nvl-2x

fits · FP16/BF16

fits · FP16/BF16

fits · FP16/BF16

fits · FP16/BF16

over · Q4_K_M

fits · FP16/BF16

fits · FP16/BF16

Qwen 2.5 Coder 32B (32.5B)

fits · FP16/BF16

fits · FP16/BF16

fits · FP16/BF16

fits · FP16/BF16

fits · FP16/BF16
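
A rough way to sanity-check a fit verdict is to compare the weight footprint (parameter count × bytes per weight) with the weights budget at 8K context. The sketch below assumes 2 bytes per weight for FP16/BF16 and decimal gigabytes, and it treats the pooled memory of the 2× H100 NVL as a single budget. It does not reproduce how the catalog derives its budgets or verdicts; `weights_gb` and `fits` are illustrative names, not part of the site.

```python
BYTES_PER_WEIGHT_FP16 = 2.0  # FP16 and BF16 both store 16 bits per weight

# Weights budget at 8K context, in GB, as listed in the specs
WEIGHTS_BUDGET_GB = {"b200": 154, "h100-nvl-2x": 151}

def weights_gb(params_billions: float,
               bytes_per_weight: float = BYTES_PER_WEIGHT_FP16) -> float:
    """Approximate weight footprint in GB (billions of params x bytes each)."""
    return params_billions * bytes_per_weight

def fits(params_billions: float, card: str,
         bytes_per_weight: float = BYTES_PER_WEIGHT_FP16) -> bool:
    """True if the estimated weights fit within the card's weights budget."""
    return weights_gb(params_billions, bytes_per_weight) <= WEIGHTS_BUDGET_GB[card]

# Qwen 2.5 Coder 32B is listed with 32.5B parameters
size = weights_gb(32.5)
for card, budget in WEIGHTS_BUDGET_GB.items():
    verdict = "fits" if fits(32.5, card) else "over"
    print(f"{card}: {size:.0f} GB of {budget} GB budget -> {verdict}")
```

At FP16/BF16 this puts the 32.5B-parameter model at about 65 GB, well inside both budgets, which is consistent with a "fits" verdict on either card.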

Which one wins for…

$ ./recommend --by-workload

More VRAM headroom

B200 has 4 GB more.

Faster decode (bandwidth)

B200, by 103% (8,000 vs 3,938 GB/s).

Faster prefill (compute)

B200, by 14% (2,250 vs 1,979 FP16 TFLOPS).

Catalog models that fit

Tied: 27 of 30 fit on each.
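
Decode is usually treated as memory-bandwidth-bound: generating each token reads roughly all of the weights once, so bandwidth divided by weight size gives a ceiling on single-stream tokens per second. A back-of-the-envelope sketch under that assumption, using a 65 GB footprint (a 32.5B-parameter model at 2 bytes per weight) as the example. It takes the listed bandwidth figures at face value, including the assumption that the 2× H100 NVL figure is fully usable across both cards, and real throughput will be lower. The function name is hypothetical.

```python
# Memory bandwidth in GB/s, as listed in the specs
BANDWIDTH_GB_S = {"b200": 8000, "h100-nvl-2x": 3938}

def decode_ceiling_tok_s(bandwidth_gb_s: float, weights_gb: float) -> float:
    """Upper bound on tokens/s if every token streams all weights once."""
    return bandwidth_gb_s / weights_gb

weights = 32.5 * 2.0  # 32.5B params at 2 bytes each, about 65 GB

for card, bw in BANDWIDTH_GB_S.items():
    print(f"{card}: at most ~{decode_ceiling_tok_s(bw, weights):.0f} tok/s")

# The ratio between the two ceilings is just the bandwidth ratio
ratio = BANDWIDTH_GB_S["b200"] / BANDWIDTH_GB_S["h100-nvl-2x"]
print(f"b200 ceiling is {ratio:.2f}x the h100-nvl-2x ceiling")
```

Because the model size cancels out, the ratio between the two ceilings is the same for any model that fits on both, which is why the bandwidth gap is the headline number for decode.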
