~/gpu/b200 vs h100-nvl-2x

B200 vs 2× H100 NVL

Head-to-head for local LLM inference. The honest comparison: VRAM, bandwidth, compute, and which of the 30 catalog models actually fit on each.

The specs.

$ diff specs b200 h100-nvl-2x
Stat                        b200            h100-nvl-2x     Δ vs b200
VRAM                        192 GB          188 GB          -2%
Memory bandwidth            8,000 GB/s      3,938 GB/s      -51%
FP16 compute                2,250 TFLOPS    1,979 TFLOPS    -12%
Weights budget at 8K ctx    154 GB          151 GB          -2%
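
The Δ column can be reproduced from the two spec columns. The sketch below assumes Δ is the percent change from b200 to h100-nvl-2x, rounded to the nearest whole percent; the helper name `pct_delta` is made up for illustration and is not part of the site's tooling.

```python
def pct_delta(baseline: float, other: float) -> int:
    """Percent change going from baseline to other, rounded to a whole percent."""
    return round((other - baseline) / baseline * 100)

# (b200, h100-nvl-2x) pairs copied from the spec table above.
specs = {
    "VRAM (GB)": (192, 188),
    "Memory bandwidth (GB/s)": (8000, 3938),
    "FP16 compute (TFLOPS)": (2250, 1979),
    "Weights budget at 8K ctx (GB)": (154, 151),
}

deltas = {name: pct_delta(b200, h100) for name, (b200, h100) in specs.items()}

for name, delta in deltas.items():
    print(f"{name}: {delta:+d}%")
```

Under that assumption the four results are -2%, -51%, -12% and -2%, matching the table.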

Model fit difference.

$ models that change with the card
Fits on both           27 of 30
Only on b200           0
Only on h100-nvl-2x    0

// showing 12 of 30 models; differing fits first

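Model             b200              h100-nvl-2x
[name missing]    fits FP16/BF16    fits FP16/BF16
[name missing]    fits FP16/BF16    fits FP16/BF16
[name missing]    fits FP16/BF16    fits FP16/BF16
[name missing]    fits FP16/BF16    fits FP16/BF16
[name missing]    over Q4_K_M       over Q4_K_M
[name missing]    fits FP16/BF16    fits FP16/BF16
[name missing]    fits FP16/BF16    fits FP16/BF16
[name missing]    fits FP16/BF16    fits FP16/BF16
[name missing]    fits FP16/BF16    fits FP16/BF16
[name missing]    fits FP16/BF16    fits FP16/BF16
[name missing]    fits FP16/BF16    fits FP16/BF16
[name missing]    fits FP16/BF16    fits FP16/BF16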
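
A rough sketch of how a fit label like "fits FP16/BF16" or "over Q4_K_M" could be derived. Everything here is an assumption rather than the site's actual method: that a model is tried at FP16/BF16 first and Q4_K_M last, that weights take 2 bytes per parameter at FP16/BF16 and roughly 0.6 bytes at Q4_K_M, and that the result is compared against the weights budget from the spec table. The parameter counts are hypothetical and do not refer to specific catalog models.

```python
# Assumed quantization ladder, highest precision first.
# Bytes per parameter are approximations, not measured file sizes.
QUANTS = [("FP16/BF16", 2.0), ("Q4_K_M", 0.6)]

def best_fit(params_billion: float, budget_gb: float) -> tuple[str, str]:
    """Return ("fits", quant) for the highest-precision quant that fits the
    weights budget, or ("over", lowest_quant) if none does."""
    for name, bytes_per_param in QUANTS:
        # 1e9 params * bytes per param / 1e9 bytes per GB
        weights_gb = params_billion * bytes_per_param
        if weights_gb <= budget_gb:
            return ("fits", name)
    return ("over", QUANTS[-1][0])

# Weights budgets at 8K context, from the spec table.
BUDGETS_GB = {"b200": 154, "h100-nvl-2x": 151}

for params in (70, 405):  # hypothetical model sizes, in billions of parameters
    for card, budget in BUDGETS_GB.items():
        status, quant = best_fit(params, budget)
        print(f"{params}B on {card}: {status} {quant}")
```

With budgets only 3 GB apart, a model would need weights between 151 GB and 154 GB at some quantization level to land differently on the two cards, which is consistent with the "Only on" counts both being 0.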

Which one wins for…

$ ./recommend --by-workload
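More VRAM headroom: B200, with 4 GB more (192 GB vs 188 GB).

Faster decode (bandwidth): B200, with about 103% more memory bandwidth (8,000 GB/s vs 3,938 GB/s).

Faster prefill (compute): B200, with about 14% more FP16 TFLOPS (2,250 vs 1,979).

Catalog models that fit: tied, 27 of 30 fit on each.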
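
Decode is tied to bandwidth because generating one token at batch size 1 reads roughly all of the weights from memory once. The sketch below turns that into a throughput ceiling. It is a back-of-envelope estimate under stated assumptions: it ignores KV-cache reads, kernel overhead, and interconnect traffic between the two H100 NVL cards, and it treats their combined 3,938 GB/s as fully usable. The 140 GB model size is hypothetical (70B parameters at 2 bytes each).

```python
def decode_ceiling_tok_s(bandwidth_gb_s: float, weights_gb: float) -> float:
    """Upper bound on single-stream decode speed, assuming each generated
    token requires reading all weights from memory exactly once."""
    return bandwidth_gb_s / weights_gb

WEIGHTS_GB = 140.0  # hypothetical: 70B parameters at FP16/BF16

b200 = decode_ceiling_tok_s(8000, WEIGHTS_GB)
h100_pair = decode_ceiling_tok_s(3938, WEIGHTS_GB)

# Relative advantage of b200 over the H100 NVL pair, in whole percent.
advantage_pct = round((b200 / h100_pair - 1) * 100)

print(f"b200 ceiling:        {b200:.1f} tok/s")
print(f"h100-nvl-2x ceiling: {h100_pair:.1f} tok/s")
print(f"b200 advantage:      +{advantage_pct}%")
```

The model size cancels out of the ratio, so the advantage comes out at +103% for any model that fits on both. This is the same gap as the -51% in the spec table, measured from the other side.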

