~/gpu/rtx-3090 vs rtx-5090

RTX 3090vsRTX 5090

Head-to-head for local LLM inference. The honest comparison: VRAM, bandwidth, compute, and which of the 30 catalog models actually fit on each.

The specs.

$ diff specs rtx-3090 rtx-5090

Stat

rtx-3090

rtx-5090

VRAM

24 GB

32 GB

+33%

Memory bandwidth

936 GB/s

1,792 GB/s

+91%

FP16 compute

142 TFLOPS

838 TFLOPS

+490%

Weights budget at 8K ctx

18 GB

25 GB

+39%

Model fit difference.

$ models that change with the card

Fits on both

21of 30

Only on rtx-3090

Only on rtx-5090

// showing 12 of 30 models; differing fits first

Model

rtx-3090

rtx-5090

Mixtral 8x7B46.7B

overQ4_K_M

fitsAWQ 4-BIT

fitsFP16/BF16

fitsFP16/BF16

fitsFP16/BF16

overQ4_K_M

overQ4_K_M

fitsFP16/BF16

fitsAWQ 4-BIT

fitsQ5_K_M

Qwen 2.5 Coder 32B32.5B

fitsAWQ 4-BIT

fitsQ5_K_M

Qwen 2.5 72B72.7B

overQ4_K_M

Qwen3 30B A3B30.5B

fitsQ4_K_M

fitsQ5_K_M

Qwen 3.5 9B9B

fitsFP16/BF16

Which one wins for…

$ ./recommend --by-workload

More VRAM headroom

RTX 5090 has 8 GB more.

Faster decode (bandwidth)

RTX 5090 by +91%.

Faster prefill (compute)

RTX 5090 by +490% TFLOPS.

Catalog models that fit

RTX 5090: 22 fit · RTX 3090: 21.

Discussion.

$ gh discussion list

// sign in with github to leave a comment. threads live in the repo's discussions tab.

RTX 3090 manufacturerRTX 3090vsRTX 5090 manufacturerRTX 5090

The specs.

Model fit difference.

Which one wins for…

Drill into either card.

Discussion.

RTX 3090vsRTX 5090