~/model/meta-llama/Llama-3.3-70B-Instruct

Llama 3.3 70B

Meta's late-2024 70B. Closes the gap to GPT-4o-mini at the same VRAM footprint as 3.1 70B.

Parameters

70.6B

Family

Meta

Context

128K tokens

FP16 weights

141GB

// where you can run it

llama3.3:70b LM Studio vLLM MLX oMLX

// hugging face stats (cached daily)

981K downloads · 3K likes · license: llama3.3 · updated 1 year ago

What you need to run this.

$ ./vrambudget --model llama-3-3-70b --by quant

// budgets shown at ctx 8K, concurrency 1, 15% safety headroom. Tune in the calculator →

Alternatives at this size.

$ grep --params similar catalog.json

$ similarQwen 2.5 72B72.7BAlibaba $ similarMixtral 8x7B46.7BMistral $ similarCommand R+104BCohere $ similarQwen 3.6 35B A3B35BAlibaba

Discussion.

$ gh discussion list

// sign in with github to leave a comment. threads live in the repo's discussions tab.