~/gpu/rtx-4070-ti-s vs rtx-4080-s

RTX 4070 Ti Super vs RTX 4080 Super

Head-to-head for local LLM inference. The honest comparison: VRAM, bandwidth, compute, and which of the 30 catalog models actually fit on each.

The specs.

$ diff specs rtx-4070-ti-s rtx-4080-s
Stat                       rtx-4070-ti-s   rtx-4080-s   Δ
VRAM                       16 GB           16 GB        0%
Memory bandwidth           672 GB/s        736 GB/s     +10%
FP16 compute               350 TFLOPS      410 TFLOPS   +17%
Weights budget at 8K ctx   12 GB           12 GB        0%

Model fit difference.

$ models that change with the card
Fits on both: 16 of 30
Only on rtx-4070-ti-s: 0
Only on rtx-4080-s: 0

// showing 12 of 30 models; differing fits first

Model   rtx-4070-ti-s      rtx-4080-s
-       fits  FP16/BF16    fits  FP16/BF16
-       fits  FP16/BF16    fits  FP16/BF16
-       fits  Q8_0         fits  Q8_0
-       over  Q4_K_M       over  Q4_K_M
-       over  Q4_K_M       over  Q4_K_M
-       fits  Q8_0         fits  Q8_0
-       over  Q4_K_M       over  Q4_K_M
-       over  Q4_K_M       over  Q4_K_M
-       over  Q4_K_M       over  Q4_K_M
-       over  Q4_K_M       over  Q4_K_M
-       fits  Q8_0         fits  Q8_0
-       fits  Q3_K_M       fits  Q3_K_M
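Both columns are identical because the fit check depends only on the weights budget, which is 12 GB on either card. A rough sketch of such a check follows. The page does not state its exact rule, so everything here is an assumption: the bits-per-weight figures are approximate values for common GGUF quantizations, the parameter counts in the usage lines are hypothetical rather than catalog entries, and `best_fit` is an illustrative name.

```python
# Approximate bits per weight (assumed values, not taken from the page).
BITS_PER_WEIGHT = {
    "FP16/BF16": 16.0,
    "Q8_0": 8.5,
    "Q4_K_M": 4.85,
    "Q3_K_M": 3.9,
}


def weights_gb(params_billion: float, quant: str) -> float:
    """Estimated weight size in GB (1 GB = 1e9 bytes)."""
    return params_billion * BITS_PER_WEIGHT[quant] / 8.0


def best_fit(params_billion: float, quants: list[str], budget_gb: float = 12.0):
    """Return ("fits", quant) for the first listed quant within budget,
    or ("over", smallest_quant) if none fits.

    `quants` must be ordered from highest to lowest precision.
    """
    for quant in quants:
        if weights_gb(params_billion, quant) <= budget_gb:
            return ("fits", quant)
    return ("over", quants[-1])


# Hypothetical model sizes, for illustration only.
ladder = ["FP16/BF16", "Q8_0", "Q4_K_M"]
small = best_fit(3, ladder)    # 6 GB at FP16, within budget
medium = best_fit(8, ladder)   # 16 GB at FP16 is too big, 8.5 GB at Q8_0 fits
large = best_fit(32, ladder)   # about 19.4 GB even at Q4_K_M
```

Since `budget_gb` is the same for rtx-4070-ti-s and rtx-4080-s, any model gets the same verdict on both, which matches the zero "only on" counts.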

Which one wins for…

$ ./recommend --by-workload
More VRAM headroom

Tied at 16 GB. Choose on bandwidth or price.

Faster decode (bandwidth)

RTX 4080 Super, by about 10% (736 vs 672 GB/s).

Faster prefill (compute)

RTX 4080 Super, by about 17% (410 vs 350 TFLOPS).

Catalog models that fit

Tied: 16 of 30 fit on each.
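Why bandwidth maps to decode speed: in single-stream decoding, each generated token has to read roughly all of the model weights from VRAM once, so bandwidth divided by weight size gives an upper bound on tokens per second. The sketch below applies that rule of thumb. It is an assumption-laden estimate, not a benchmark: it ignores KV-cache reads, kernel overhead, and batching, and the 8.5 GB model size is hypothetical.

```python
def decode_ceiling_tok_s(bandwidth_gb_s: float, weights_gb: float) -> float:
    """Upper bound on decode tokens/s if every token streams all weights once."""
    return bandwidth_gb_s / weights_gb


# Memory bandwidth from the spec table, in GB/s.
cards = {"rtx-4070-ti-s": 672.0, "rtx-4080-s": 736.0}

# Hypothetical model occupying 8.5 GB of weights.
model_weights_gb = 8.5

ceilings = {
    name: decode_ceiling_tok_s(bandwidth, model_weights_gb)
    for name, bandwidth in cards.items()
}

for name, ceiling in ceilings.items():
    print(f"{name}: at most ~{ceiling:.0f} tok/s")
```

Whatever the model size, the ratio between the two ceilings equals the bandwidth ratio, which is where the roughly 10% decode advantage comes from. Real throughput will sit below these ceilings.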

Drill into either card.

$ ./vrambudget --gpu
