~/gpu/rtx-6000-ada vs rtx-6000-pro

RTX 6000 Ada vs RTX Pro 6000

Head-to-head for local LLM inference. The honest comparison: VRAM, memory bandwidth, compute, and which of the 30 catalog models actually fit on each card.

The specs.

$ diff specs rtx-6000-ada rtx-6000-pro
Stat                        rtx-6000-ada    rtx-6000-pro    Δ
VRAM                        48 GB           96 GB           +100%
Memory bandwidth            960 GB/s        1,792 GB/s      +87%
FP16 compute                364 TFLOPS      510 TFLOPS      +40%
Weights budget at 8K ctx    37 GB           76 GB           +105%
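
The Δ column reads as a plain relative change from the rtx-6000-ada figure to the rtx-6000-pro figure. A minimal Python sketch that recomputes it from the values in the table; the `pct_delta` helper and the `SPECS` layout are illustrative names, not part of any tool on this page:

```python
# Spec values copied from the table above: (rtx-6000-ada, rtx-6000-pro).
SPECS = {
    "VRAM (GB)": (48, 96),
    "Memory bandwidth (GB/s)": (960, 1792),
    "FP16 compute (TFLOPS)": (364, 510),
    "Weights budget at 8K ctx (GB)": (37, 76),
}


def pct_delta(base: float, other: float) -> int:
    """Relative change from base to other, rounded to a whole percent."""
    return round((other - base) / base * 100)


# Hypothetical helper output: one whole-percent delta per stat.
DELTAS = {name: pct_delta(ada, pro) for name, (ada, pro) in SPECS.items()}

for name, delta in DELTAS.items():
    print(f"{name}: {delta:+d}%")
```

Rounding to a whole percent reproduces the four deltas shown: +100%, +87%, +40%, and +105%.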

Model fit difference.

$ models that change with the card
Fits on both: 24 of 30
Only on rtx-6000-ada: 0
Only on rtx-6000-pro: 3

// showing 12 of 30 models; differing fits first

Model        rtx-6000-ada      rtx-6000-pro
(missing)    over Q4_K_M       fits Q4_K_M
(missing)    over Q4_K_M       fits AWQ 4-BIT
(missing)    over Q4_K_M       fits Q5_K_M
(missing)    fits FP16/BF16    fits FP16/BF16
(missing)    fits FP16/BF16    fits FP16/BF16
(missing)    fits FP16/BF16    fits FP16/BF16
(missing)    fits Q3_K_M       fits Q8_0
(missing)    over Q4_K_M       over Q4_K_M
(missing)    fits FP16/BF16    fits FP16/BF16
(missing)    fits Q8_0         fits FP16/BF16
(missing)    fits Q8_0         fits FP16/BF16
(missing)    fits Q3_K_M       fits FP8/INT8
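
Each cell above pairs a fit verdict with a quantization format. One plausible way to produce such a verdict is to walk the formats from highest to lowest precision and keep the first whose weights fit the card's weights budget. The sketch below does that in Python under loud assumptions: the bits-per-weight figures are rough approximations that vary by model, `best_fit` and `weights_gb` are hypothetical helpers, and the catalog's actual selection rule is not stated on this page.

```python
# Approximate bits per weight, highest precision first.
# Assumed values for illustration only; real quantized files vary by model.
BITS_PER_WEIGHT = [
    ("FP16/BF16", 16.0),
    ("Q8_0", 8.5),
    ("Q5_K_M", 5.7),
    ("Q4_K_M", 4.85),
    ("Q3_K_M", 3.9),
]

# Weights budgets at 8K context, taken from the specs table.
WEIGHTS_BUDGET_GB = {"rtx-6000-ada": 37, "rtx-6000-pro": 76}


def weights_gb(params_billion: float, bits: float) -> float:
    """Weight size in decimal GB: (params * bits / 8) bytes / 1e9."""
    return params_billion * bits / 8


def best_fit(params_billion: float, gpu: str):
    """Return the highest-precision format that fits, or None if none does."""
    budget = WEIGHTS_BUDGET_GB[gpu]
    for name, bits in BITS_PER_WEIGHT:
        if weights_gb(params_billion, bits) <= budget:
            return name
    return None


# Hypothetical 70B-parameter dense model, not a specific catalog entry.
for gpu in WEIGHTS_BUDGET_GB:
    print(gpu, best_fit(70, gpu))
```

Under these assumed figures, a hypothetical 70B model lands on Q3_K_M for rtx-6000-ada and Q8_0 for rtx-6000-pro, which shows how doubling the weights budget can move a model up several quantization tiers.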

Which one wins for…

$ ./recommend --by-workload
More VRAM headroom

RTX Pro 6000, with 48 GB more (96 GB vs 48 GB).

Faster decode (bandwidth)

RTX Pro 6000, with 87% more memory bandwidth.

Faster prefill (compute)

RTX Pro 6000, with 40% more FP16 TFLOPS.

Catalog models that fit

RTX Pro 6000: 27 of 30 fit · RTX 6000 Ada: 24 of 30.
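
The decode and prefill verdicts follow from a common back-of-the-envelope model: decode is roughly memory-bound, since every generated token reads all the weights once, while prefill is roughly compute-bound at about 2 FLOPs per parameter per prompt token for a dense model. The Python sketch below turns the spec figures into best-case bounds. The function names, the 35 GB weight file, and the 70B / 8,192-token example are assumptions for illustration, and real throughput will fall short of these ceilings.

```python
# Bandwidth and FP16 figures copied from the specs table.
GPUS = {
    "rtx-6000-ada": {"bandwidth_gb_s": 960, "fp16_tflops": 364},
    "rtx-6000-pro": {"bandwidth_gb_s": 1792, "fp16_tflops": 510},
}


def decode_ceiling_tok_s(gpu: str, weights_gb: float) -> float:
    """Upper bound on decode speed: bandwidth / bytes of weights read per token."""
    return GPUS[gpu]["bandwidth_gb_s"] / weights_gb


def prefill_floor_s(gpu: str, params_billion: float, prompt_tokens: int) -> float:
    """Lower bound on prefill time at peak FP16 throughput (dense model)."""
    flops = 2 * params_billion * 1e9 * prompt_tokens
    return flops / (GPUS[gpu]["fp16_tflops"] * 1e12)


# Hypothetical workload: 35 GB of weights, 70B parameters, 8,192-token prompt.
for gpu in GPUS:
    tok_s = decode_ceiling_tok_s(gpu, weights_gb=35)
    prefill = prefill_floor_s(gpu, params_billion=70, prompt_tokens=8192)
    print(f"{gpu}: decode <= {tok_s:.1f} tok/s, prefill >= {prefill:.2f} s")
```

Because both bounds scale linearly with the spec figure, the ratio between the two cards matches the table regardless of the model chosen: about 1.87x for decode and about 1.40x for prefill.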

