Meta's late-2024 70B. Closes the gap to GPT-4o-mini at the same VRAM footprint as 3.1 70B.
// budgets shown at ctx 8K, concurrency 1, 15% safety headroom. Tune in the calculator →
// sign in with github to leave a comment. threads live in the repo's discussions tab.