~/gpu/arc-b580 vs rtx-4060

Arc B580 vs RTX 4060

Head-to-head for local LLM inference. The honest comparison: VRAM, bandwidth, compute, and which of the 30 catalog models actually fit on each.

The specs.

$ diff specs arc-b580 rtx-4060
Stat                        arc-b580     rtx-4060     Δ (rtx-4060 vs arc-b580)
VRAM                        12 GB        8 GB         -33%
Memory bandwidth            456 GB/s     272 GB/s     -40%
FP16 compute                56 TFLOPS    121 TFLOPS   +116%
Weights budget at 8K ctx    8.3 GB       5.1 GB       -39%
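The Δ column reads as the RTX 4060 value relative to the Arc B580 value. A minimal sketch that reproduces it from the listed specs, assuming whole-percent rounding (`pct_delta` is an illustrative helper name, not something the page defines):

```python
def pct_delta(baseline, other):
    """Percent change of `other` relative to `baseline`, rounded to a whole percent."""
    return round((other - baseline) / baseline * 100)

# (arc-b580, rtx-4060) values copied from the spec comparison above
specs = {
    "VRAM (GB)": (12, 8),
    "Memory bandwidth (GB/s)": (456, 272),
    "FP16 compute (TFLOPS)": (56, 121),
    "Weights budget at 8K ctx (GB)": (8.3, 5.1),
}

# Arc B580 is the baseline, so negative means the RTX 4060 has less
deltas = {name: pct_delta(arc, rtx) for name, (arc, rtx) in specs.items()}

for name, delta in deltas.items():
    print(f"{name}: {delta:+d}%")
```

Because the baseline is the Arc B580, the same bandwidth gap shows up as -40% here and as +68% when stated from the RTX 4060 side.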

Model fit difference.

$ models that change with the card
Fits on both        10 of 30
Only on arc-b580    2
Only on rtx-4060    0

// showing 12 of 30 models; differing fits first

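Model             arc-b580            rtx-4060
(name missing)    fits · Q4_K_M       over · Q4_K_M
(name missing)    fits · AWQ 4-BIT    over · Q4_K_M
(name missing)    fits · FP16/BF16    fits · FP16/BF16
(name missing)    fits · FP16/BF16    fits · Q8_0
(name missing)    fits · FP8/INT8     fits · Q4_K_M
(name missing)    over · Q4_K_M       over · Q4_K_M
(name missing)    over · Q4_K_M       over · Q4_K_M
(name missing)    fits · Q8_0         fits · Q5_K_M
(name missing)    over · Q4_K_M       over · Q4_K_M
(name missing)    over · Q4_K_M       over · Q4_K_M
(name missing)    over · Q4_K_M       over · Q4_K_M
(name missing)    over · Q4_K_M       over · Q4_K_M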
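
The page does not say how it decides each fit, but the pattern above (highest-precision format that fits, otherwise "over" at Q4_K_M) can be sketched roughly as follows. Everything here is an assumption for illustration: the `best_fit` helper, the bits-per-weight figures (approximate values for common GGUF formats), and the example parameter counts. Only the 8.3 GB and 5.1 GB weights budgets come from the spec comparison.

```python
# Approximate bits per weight, highest precision first.
# Assumed values for illustration; not taken from the page.
BITS_PER_WEIGHT = [
    ("FP16/BF16", 16.0),
    ("Q8_0", 8.5),
    ("Q5_K_M", 5.7),
    ("Q4_K_M", 4.85),
]

def best_fit(params_billion, budget_gb):
    """Return ("fits", fmt) for the highest-precision format whose weights
    fit in budget_gb, else ("over", lowest-precision format)."""
    for fmt, bits in BITS_PER_WEIGHT:
        # 1e9 params * bits / 8 bytes per param = size in GB (decimal)
        size_gb = params_billion * bits / 8
        if size_gb <= budget_gb:
            return ("fits", fmt)
    return ("over", BITS_PER_WEIGHT[-1][0])

ARC_BUDGET_GB = 8.3  # weights budget at 8K ctx, from the spec comparison
RTX_BUDGET_GB = 5.1

# Hypothetical 13B and 7B parameter models
for params in (13, 7):
    print(params, best_fit(params, ARC_BUDGET_GB), best_fit(params, RTX_BUDGET_GB))
```

Under these assumptions a 13B model fits only on the Arc B580 (at Q4_K_M), while a 7B model fits on both but at a higher precision on the Arc B580.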

Which one wins for…

$ ./recommend --by-workload
More VRAM headroom

Arc B580 has 4 GB more.

Faster decode (bandwidth)

Arc B580 by +68%.

Faster prefill (compute)

RTX 4060 by +116% TFLOPS.

Catalog models that fit

Arc B580: 12 fit · RTX 4060: 10.
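
Decode is tied to bandwidth above because generating each token has to stream roughly all of the model weights from VRAM once. A rough sketch of that upper bound, assuming a hypothetical 5 GB set of weights and ignoring KV-cache reads, kernel efficiency, and software-stack differences between the two cards (`decode_tokens_per_s` is an illustrative name):

```python
def decode_tokens_per_s(bandwidth_gb_s, weights_gb):
    """Rough ceiling on decode speed: one full pass over the weights per token."""
    return bandwidth_gb_s / weights_gb

WEIGHTS_GB = 5.0  # hypothetical model size, chosen only for illustration

arc = decode_tokens_per_s(456, WEIGHTS_GB)  # Arc B580 bandwidth from the spec table
rtx = decode_tokens_per_s(272, WEIGHTS_GB)  # RTX 4060 bandwidth from the spec table

# The ratio does not depend on the model size, only on the bandwidth figures
advantage_pct = round((arc / rtx - 1) * 100)

print(f"arc-b580: ~{arc:.0f} tok/s ceiling")
print(f"rtx-4060: ~{rtx:.0f} tok/s ceiling")
print(f"arc-b580 advantage: +{advantage_pct}%")
```

Real throughput will land below these ceilings on both cards; the sketch only shows where the +68% figure comes from.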

Drill into either card.

$ ./vrambudget --gpu
