MoE quants of GLM-5 (Q8_0 quantization by default, with the routed experts quantized further)
Note: running this GGUF requires pulling and compiling this llama.cpp PR: https://github.com/ggml-org/llama.cpp/pull/19460
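Until that PR is merged, a minimal sketch of checking out the PR branch and building with llama.cpp's standard CMake flow (the local branch name is arbitrary, and the CUDA flag is just an example backend option):

```bash
git clone https://github.com/ggml-org/llama.cpp
cd llama.cpp

# Fetch the PR head into a local branch (branch name is arbitrary)
git fetch origin pull/19460/head:pr-19460
git checkout pr-19460

# Standard llama.cpp CMake build; add -DGGML_CUDA=ON for NVIDIA GPUs
cmake -B build
cmake --build build --config Release -j
```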
More quants to come soon.
| Quant | Size | Mixture | PPL | KLD |
|---|---|---|---|---|
| Q4_K_M | 432.80 GiB (4.93 BPW) | Q8_0-Q4_K-Q4_K-Q5_K | 8.7486 ± 0.17123 | TBD |
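A minimal sketch of serving one of these quants with the PR build. The shard filename and shard count below are placeholders; adjust them to the files you actually downloaded. The perplexity command shows how a PPL figure like the one in the table is typically measured with llama.cpp tooling, but the test corpus used for the table is not stated, so the file here is an assumption:

```bash
# Serve the Q4_K_M quant (the split-GGUF filename is a placeholder;
# llama.cpp picks up the remaining shards automatically).
# -ngl 99 offloads all layers to the GPU; lower it or drop it for CPU-only runs.
./build/bin/llama-server \
  -m ./GLM-5-Q4_K_M/GLM-5-Q4_K_M-00001-of-00010.gguf \
  -c 8192 -ngl 99

# To measure perplexity with llama.cpp's llama-perplexity tool
# (the corpus file here is an assumption, not the one used for the table):
./build/bin/llama-perplexity \
  -m ./GLM-5-Q4_K_M/GLM-5-Q4_K_M-00001-of-00010.gguf \
  -f wiki.test.raw
```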
Base model: zai-org/GLM-5