danielhanchen
posted an update 1 day ago
You can now run Qwen3.5 locally! πŸ’œ
Qwen3.5-397B-A17B is an open MoE vision reasoning LLM for agentic coding & chat. It performs on par with Gemini 3 Pro, Claude Opus 4.5 & GPT-5.2.

GGUF: unsloth/Qwen3.5-397B-A17B-GGUF
Run the Dynamic 3-bit quant on a 192GB Mac at ~20 tokens/s.

Guide: https://unsloth.ai/docs/models/qwen3.5
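For anyone who wants a starting point before reading the full guide, a typical llama.cpp workflow looks roughly like this (the quant filename pattern and context/offload values are assumptions, check the repo and the guide for the exact names and recommended settings):

```shell
# Download only the 3-bit dynamic quant files from the GGUF repo
# (the --include pattern is a guess; list the repo files first to confirm)
huggingface-cli download unsloth/Qwen3.5-397B-A17B-GGUF \
  --include "*Q3*" --local-dir ./qwen3.5-gguf

# Run an interactive chat with llama.cpp; -ngl offloads layers to GPU
# (set -ngl 0 on a Mac without discrete GPU, Metal is used automatically)
llama-cli -m ./qwen3.5-gguf/Qwen3.5-397B-A17B-Q3_K.gguf \
  -c 8192 -ngl 99 -p "Hello"
```

Multi-part GGUF files (split into `-00001-of-000NN` shards) can be pointed at via the first shard; llama.cpp picks up the rest.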

Cool! Thanks so much for the quantization!

A magnificent achievement! Thanks

Amazing! Thanks for expanding the access to cool models! πŸ‘

Any steps to run it on my local machine?

So I've never tried these big models, but will offloading to DDR4 RAM get me at least 5 tok/s? I have two RTX 3090s, so definitely not enough for full VRAM offloading, haha.
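As a rough sanity check on the DDR4 question: decoding is usually memory-bandwidth-bound, so an upper bound on speed is RAM bandwidth divided by the bytes of active weights read per token. For an MoE model only the active parameters (~17B here) are touched each step. The bandwidth figure and bits-per-weight below are my own assumptions, not from the post:

```python
# Back-of-envelope upper bound on decode speed for a memory-bound MoE model.
# Assumptions (hypothetical, tune for your machine):
#   - ~17B active parameters read per token (MoE routing)
#   - ~3 bits per weight after quantization
#   - dual-channel DDR4-3200 -> ~50 GB/s effective bandwidth

def max_tokens_per_second(active_params: float, bits_per_weight: float,
                          bandwidth_gb_s: float) -> float:
    bytes_per_token = active_params * bits_per_weight / 8  # weight bytes read per token
    return bandwidth_gb_s * 1e9 / bytes_per_token

est = max_tokens_per_second(17e9, 3, 50)
print(f"~{est:.1f} tok/s theoretical ceiling")
```

Real throughput lands well below this ceiling (KV cache reads, expert weights not fitting in the fastest tier), so ~5 tok/s on DDR4 alone is optimistic; the two 3090s holding the shared layers and KV cache would help.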

Looked at the quants, those are some big quants ToT

I'd like to add that you can run Qwen3.5 on a 128GB Mac using @ubergarm's smol-IQ2_XS quant; here's my experience trying it locally: https://huggingface.co/ubergarm/Qwen3.5-397B-A17B-GGUF/discussions/2


Thanks for showing how to fit the imatrix-optimized ubergarm/Qwen3.5-397B-A17B-GGUF on a 128GB Mac and get it working with images and vibe coding with plenty of context! πŸš€