DeepSeek V3

671B MoE, 37B active per token. GPT-4 class at a fraction of the inference cost. Multi-node serving at FP16.

Parameters: 671B
Family: DeepSeek
Context: 128K tokens
FP16 weights: 1342 GB
// where you can run it
deepseek-v3 · LM Studio · vLLM · MLX · oMLX
// hugging face stats (cached daily)
1.2M downloads · 4K likes · updated 1 year ago

What you need to run this.

$ ./vrambudget --model deepseek-v3 --by quant

// budgets shown at ctx 8K, concurrency 1, 15% safety headroom.
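
As a rough illustration of where a per-quant budget comes from, the sketch below estimates weights-only memory for the 671B parameter count and applies the 15% headroom as a simple multiplier. It is not the calculator's actual method. The INT8 and INT4 rows, the nominal bits-per-weight values, and the function names are assumptions for illustration, and KV cache, activations, and runtime overhead at ctx 8K are deliberately left out.

```python
# Weights-only memory estimate per quantization level.
# Assumptions (not from the page): the bits-per-weight table is nominal,
# headroom is applied as a (1 + 0.15) multiplier, and KV cache,
# activations, and runtime overhead are NOT included.

PARAMS_B = 671    # total parameters, in billions (from the page)
ACTIVE_B = 37     # parameters active per token, in billions (from the page)
HEADROOM = 0.15   # 15% safety headroom (from the page)

# Hypothetical quant table: nominal bits per weight.
BITS_PER_WEIGHT = {"FP16": 16, "INT8": 8, "INT4": 4}


def weights_gb(params_b: float, bits: int) -> float:
    """Weight size in GB (1 GB = 1e9 bytes): params * bits / 8."""
    return params_b * bits / 8


def budget_gb(params_b: float, bits: int, headroom: float = HEADROOM) -> float:
    """Weights plus a proportional safety margin."""
    return weights_gb(params_b, bits) * (1 + headroom)


if __name__ == "__main__":
    for name, bits in BITS_PER_WEIGHT.items():
        print(f"{name}: weights {weights_gb(PARAMS_B, bits):.0f} GB, "
              f"with headroom {budget_gb(PARAMS_B, bits):.0f} GB")
    # All experts must be resident in memory even though only a
    # small share of the parameters is used for any one token.
    print(f"active fraction: {ACTIVE_B / PARAMS_B:.1%}")
```

At 16 bits per weight this reproduces the 1342 GB FP16 figure listed above (671B parameters × 2 bytes). Real quantized checkpoints usually carry extra scale and zero-point data, so actual sizes will likely run somewhat above the nominal numbers.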

Alternatives at this size.

$ grep --params similar catalog.json
