~/model/meta-llama/Llama-3.2-3B-Instruct

Llama 3.2 3B

Edge-tier. Assistant-quality on a 6GB GPU. 128K context.

Parameters

3.21B

Family

Meta

Context

128K tokens

FP16 weights

6.4GB

// where you can run it

llama3.2:3b LM Studio vLLM MLX oMLX

// hugging face stats (cached daily)

2.2M downloads · 2K likes · license: llama3.2 · updated 1 year ago

What you need to run this.

$ ./vrambudget --model llama-3-2-3b --by quant

// budgets shown at ctx 8K, concurrency 1, 15% safety headroom. Tune in the calculator →

Alternatives at this size.

$ grep --params similar catalog.json

$ similarPhi-4 Mini3.8BMicrosoft $ similarGemma 4 E4B4BGoogle $ similarLlama 3.2 1B1.23BMeta $ similarQwen 2.5 7B7BAlibaba

Discussion.

$ gh discussion list

// sign in with github to leave a comment. threads live in the repo's discussions tab.