~/gpu/a6000 vs rtx-6000-ada

RTX A6000 vs RTX 6000 Ada

Head-to-head for local LLM inference. The honest comparison: VRAM, bandwidth, compute, and which of the 30 catalog models actually fit on each.

The specs.

$ diff specs a6000 rtx-6000-ada
Stat                      a6000       rtx-6000-ada  Δ
VRAM                      48 GB       48 GB         0%
Memory bandwidth          768 GB/s    960 GB/s      +25%
FP16 compute              155 TFLOPS  364 TFLOPS    +135%
Weights budget at 8K ctx  37 GB       37 GB         0%

Model fit difference.

$ models that change with the card
Fits on both          24 of 30
Only on a6000         0
Only on rtx-6000-ada  0

// showing 12 of 30 models; differing fits first

Model  a6000           rtx-6000-ada
       fits FP16/BF16  fits FP16/BF16
       fits FP16/BF16  fits FP16/BF16
       fits FP16/BF16  fits FP16/BF16
       fits Q3_K_M     fits Q3_K_M
       over Q4_K_M     over Q4_K_M
       fits FP16/BF16  fits FP16/BF16
       fits Q8_0       fits Q8_0
       fits Q8_0       fits Q8_0
       fits Q3_K_M     fits Q3_K_M
       fits Q8_0       fits Q8_0
       fits FP16/BF16  fits FP16/BF16
       fits Q8_0       fits Q8_0
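
Both cards share the same 37 GB weights budget, so every fit result is identical. A rough sketch of how such a fit check could work is below. It is not the site's actual logic. The bits-per-weight figures for Q4_K_M and Q3_K_M are approximate averages and are assumptions here, as are the function name `best_quant`, the 1 GB = 1e9 bytes convention, and the example parameter counts.

```python
from typing import Optional

# Approximate bits per weight, highest precision first.
# FP16/BF16 and Q8_0 follow from their storage layouts; the K-quant
# values are rough averages and should be treated as assumptions.
QUANT_BITS = [
    ("FP16/BF16", 16.0),
    ("Q8_0", 8.5),
    ("Q4_K_M", 4.85),
    ("Q3_K_M", 3.9),
]


def weights_gb(params: float, bits_per_weight: float) -> float:
    """Size of the weights in GB (1 GB = 1e9 bytes, an assumption)."""
    return params * bits_per_weight / 8 / 1e9


def best_quant(params: float, budget_gb: float) -> Optional[str]:
    """Return the highest-precision quant whose weights fit the budget."""
    for name, bits in QUANT_BITS:
        if weights_gb(params, bits) <= budget_gb:
            return name
    return None  # nothing in the list fits


BUDGET_GB = 37  # weights budget at 8K ctx, same on both cards

# Hypothetical parameter counts, not taken from the catalog.
for params in (8e9, 32e9, 70e9, 120e9):
    print(f"{params / 1e9:.0f}B -> {best_quant(params, BUDGET_GB)}")
```

Because the budget is the only card-specific input and it is equal on both cards, swapping one card for the other cannot change which quant is selected.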

Which one wins for…

$ ./recommend --by-workload
More VRAM headroom

Tied at 48 GB. Choose on bandwidth or price.

Faster decode (bandwidth)

RTX 6000 Ada, by 25% (960 vs 768 GB/s).

Faster prefill (compute)

RTX 6000 Ada, by 135% (364 vs 155 FP16 TFLOPS).

Catalog models that fit

Tied: 24 of 30 fit on each.
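
Why bandwidth maps to decode speed: single-stream decode reads roughly all the weights once per generated token, so memory bandwidth divided by weight size gives a loose upper bound on tokens per second. The sketch below applies that common rule of thumb to the bandwidth figures above. It ignores KV-cache reads, kernel overhead, and batching, so real throughput will be lower. The function name `decode_ceiling_tok_s` and the 37 GB example size are illustrative assumptions.

```python
def decode_ceiling_tok_s(bandwidth_gb_s: float, weights_gb: float) -> float:
    """Loose upper bound on decode tokens/s for a bandwidth-bound model.

    Assumes every generated token streams all weights from VRAM once.
    """
    return bandwidth_gb_s / weights_gb


BANDWIDTH_GB_S = {"a6000": 768, "rtx-6000-ada": 960}
WEIGHTS_GB = 37  # example: a model that fills the whole weights budget

CEILINGS = {
    gpu: decode_ceiling_tok_s(bw, WEIGHTS_GB)
    for gpu, bw in BANDWIDTH_GB_S.items()
}

for gpu, tok_s in CEILINGS.items():
    print(f"{gpu:<14} <= {tok_s:.1f} tok/s")

# The ratio of the two ceilings equals the bandwidth ratio, whatever the model size.
RATIO = CEILINGS["rtx-6000-ada"] / CEILINGS["a6000"]
```

Under this estimate the ratio between the two cards stays at 1.25 for any model that fits, which is where the 25% decode advantage comes from.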

