~/gpu/rtx-4090 vs rx-7900-xtx

RTX 4090 vs RX 7900 XTX

Head-to-head for local LLM inference. The honest comparison: VRAM, bandwidth, compute, and which of the 30 catalog models actually fit on each.

The specs.

$ diff specs rtx-4090 rx-7900-xtx
Stat                      rtx-4090     rx-7900-xtx  Δ
VRAM                      24 GB        24 GB        0%
Memory bandwidth          1,008 GB/s   960 GB/s     -5%
FP16 compute              330 TFLOPS   122 TFLOPS   -63%
Weights budget at 8K ctx  18 GB        18 GB        0%
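
The Δ column is consistent with a percent change from the rtx-4090 value to the rx-7900-xtx value, rounded to a whole percent. That reading is an assumption inferred from the numbers above, not something the page states. A minimal sketch (the function name `pct_delta` and the `specs` dictionary are illustrative):

```python
def pct_delta(baseline: float, other: float) -> int:
    """Percent change from baseline to other, rounded to a whole percent."""
    return round((other - baseline) / baseline * 100)


# (rtx-4090, rx-7900-xtx) values copied from the specs table above.
specs = {
    "VRAM (GB)": (24, 24),
    "Memory bandwidth (GB/s)": (1008, 960),
    "FP16 compute (TFLOPS)": (330, 122),
    "Weights budget at 8K ctx (GB)": (18, 18),
}

# Assumption: the rtx-4090 column is the baseline for every row.
deltas = {name: pct_delta(a, b) for name, (a, b) in specs.items()}

for name, delta in deltas.items():
    print(f"{name}: {delta}%")
```

Under that assumption the sketch gives the same 0%, -5%, -63%, and 0% as the table.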

Model fit difference.

$ models that change with the card
Fits on both: 21 of 30
Only on rtx-4090: 0
Only on rx-7900-xtx: 0

// showing 12 of 30 models; differing fits first

Model  rtx-4090          rx-7900-xtx
       fits FP16/BF16    fits FP16/BF16
       fits FP16/BF16    fits FP16/BF16
       fits FP16/BF16    fits FP16/BF16
       over Q4_K_M       over Q4_K_M
       over Q4_K_M       over Q4_K_M
       fits FP16/BF16    fits FP16/BF16
       fits AWQ 4-BIT    fits AWQ 4-BIT
       fits AWQ 4-BIT    fits AWQ 4-BIT
       over Q4_K_M       over Q4_K_M
       fits Q4_K_M       fits Q4_K_M
       fits FP16/BF16    fits FP16/BF16
       fits Q4_K_M       fits Q4_K_M
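
The fit labels can be read as "the highest-precision format whose weights fit the 18 GB budget, else over at the smallest format tried". The sketch below follows that reading. Everything in it other than the 18 GB budget is an assumption: the helper names, the format order, 1 GB = 1e9 bytes, the roughly 4.85 bits per weight used for Q4_K_M, and the example parameter counts (which are not catalog entries). The page's actual fit rule is not shown.

```python
WEIGHTS_BUDGET_GB = 18.0  # "Weights budget at 8K ctx" from the specs table

# Assumed bits per weight, highest precision first. The Q4_K_M figure is
# an approximate average, not an exact constant.
FORMATS = [("FP16/BF16", 16.0), ("Q4_K_M", 4.85)]


def weights_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight size in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


def best_fit(params_billion: float, budget_gb: float = WEIGHTS_BUDGET_GB):
    """Return ("fits", fmt) for the first format within budget,
    else ("over", smallest format tried)."""
    for name, bits in FORMATS:
        if weights_gb(params_billion, bits) <= budget_gb:
            return ("fits", name)
    return ("over", FORMATS[-1][0])


# Hypothetical model sizes, in billions of parameters.
for size in (7, 14, 70):
    print(size, best_fit(size))
```

Because both cards share the same 18 GB budget, any rule of this shape returns identical labels for each, which matches the 0 / 0 split above.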

Which one wins for…

$ ./recommend --by-workload
More VRAM headroom

Tied at 24 GB. Choose on bandwidth or price.

Faster decode (bandwidth)

RTX 4090, by about 5% (1,008 GB/s vs 960 GB/s).

Faster prefill (compute)

RTX 4090, with about 170% more FP16 TFLOPS (330 vs 122).

Catalog models that fit

Tied: 21 of 30 fit on each.
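
A rough way to see why decode tracks bandwidth and prefill tracks compute: during decode, each generated token has to stream roughly all of the weights from VRAM once, so bandwidth divided by weight size gives a ceiling on tokens per second. The sketch below uses that common rule of thumb. It is an upper bound under stated assumptions, not a benchmark: it ignores KV-cache reads, kernel efficiency, and software-stack differences, and the 14 GB weight size is a hypothetical example.

```python
def decode_ceiling_tok_s(bandwidth_gb_s: float, weights_gb: float) -> float:
    """Upper bound on decode tokens/s if every token reads all weights once."""
    return bandwidth_gb_s / weights_gb


def pct_advantage(a: float, b: float) -> int:
    """How much larger a is than b, as a whole percent."""
    return round((a / b - 1) * 100)


WEIGHTS_GB = 14.0  # hypothetical model, e.g. about 7B parameters at FP16

ceiling_4090 = decode_ceiling_tok_s(1008, WEIGHTS_GB)  # RTX 4090 bandwidth
ceiling_7900 = decode_ceiling_tok_s(960, WEIGHTS_GB)   # RX 7900 XTX bandwidth

decode_gap = pct_advantage(1008, 960)  # bandwidth ratio from the specs table
prefill_gap = pct_advantage(330, 122)  # FP16 TFLOPS ratio from the specs table

print(f"decode ceiling: {ceiling_4090:.1f} vs {ceiling_7900:.1f} tok/s")
print(f"decode gap: +{decode_gap}%, prefill gap: +{prefill_gap}%")
```

The decode gap is independent of model size because the weight size cancels in the ratio; only the absolute ceiling changes.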

Drill into either card.

$ ./vrambudget --gpu
