danielhanchen
posted an update 1 day ago
You can now run Qwen3.5 locally! πŸ’œ
Qwen3.5-397B-A17B is an open MoE vision reasoning LLM for agentic coding & chat. It performs on par with Gemini 3 Pro, Claude Opus 4.5 & GPT-5.2.

GGUF: unsloth/Qwen3.5-397B-A17B-GGUF
Run the Dynamic 3-bit quant on a 192GB Mac at ~20 tokens/s.

Guide: https://unsloth.ai/docs/models/qwen3.5
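For anyone who wants a starting point before reading the full guide, a typical llama.cpp workflow looks roughly like this (the quant filename pattern and context/offload values are assumptions, check the repo and the guide for the exact names and recommended settings):

```shell
# Download only the 3-bit dynamic quant files from the GGUF repo
# (the --include pattern is a guess; list the repo files first to confirm)
huggingface-cli download unsloth/Qwen3.5-397B-A17B-GGUF \
  --include "*Q3*" --local-dir ./qwen3.5-gguf

# Run an interactive chat with llama.cpp; -ngl offloads layers to GPU
# (set -ngl 0 on a Mac without discrete GPU, Metal is used automatically)
llama-cli -m ./qwen3.5-gguf/Qwen3.5-397B-A17B-Q3_K.gguf \
  -c 8192 -ngl 99 -p "Hello"
```

Multi-part GGUF files (split into `-00001-of-000NN` shards) can be pointed at via the first shard; llama.cpp picks up the rest.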

Cool! Thanks so much for the quantization!

A magnificent achievement! Thanks

Amazing! Thanks for expanding the access to cool models! πŸ‘

Any steps to run it on my local machine?

So I've never tried these big models, but will offloading to DDR4 RAM get me at least 5 tok/s? I have two RTX 3090s, so definitely not enough for full VRAM offloading, haha.
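As a rough sanity check on the DDR4 question: decoding is usually memory-bandwidth-bound, so an upper bound on speed is RAM bandwidth divided by the bytes of active weights read per token. For an MoE model only the active parameters (~17B here) are touched each step. The bandwidth figure and bits-per-weight below are my own assumptions, not from the post:

```python
# Back-of-envelope upper bound on decode speed for a memory-bound MoE model.
# Assumptions (hypothetical, tune for your machine):
#   - ~17B active parameters read per token (MoE routing)
#   - ~3 bits per weight after quantization
#   - dual-channel DDR4-3200 -> ~50 GB/s effective bandwidth

def max_tokens_per_second(active_params: float, bits_per_weight: float,
                          bandwidth_gb_s: float) -> float:
    bytes_per_token = active_params * bits_per_weight / 8  # weight bytes read per token
    return bandwidth_gb_s * 1e9 / bytes_per_token

est = max_tokens_per_second(17e9, 3, 50)
print(f"~{est:.1f} tok/s theoretical ceiling")
```

Real throughput lands well below this ceiling (KV cache reads, expert weights not fitting in the fastest tier), so ~5 tok/s on DDR4 alone is optimistic; the two 3090s holding the shared layers and KV cache would help.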

Looked at the quants, those are some big quants ToT

I'd like to add that you can run Qwen3.5 on a 128GB Mac using @ubergarm's smol-IQ2_XS quant; here's my experience trying it locally: https://huggingface.co/ubergarm/Qwen3.5-397B-A17B-GGUF/discussions/2


Thanks for showing how to fit the imatrix-optimized ubergarm/Qwen3.5-397B-A17B-GGUF on a 128GB Mac and get it working with images and vibe coding with plenty of context! πŸš€