Update README.md
README.md CHANGED
@@ -5,6 +5,10 @@ base_model:
 
 MoE-quants of GLM-5 (Q8_0 quantization default with routed experts quantized further)
 
+Note: running this GGUF requires pulling and compiling this llama.cpp PR: https://github.com/ggml-org/llama.cpp/pull/19460
+
+More quants to come soon.
+
 | Quant | Size | Mixture | PPL | KLD |
 | :--------- | :--------- | :------- | :------- | :--------- |
 | Q4_K_M | 432.80 GiB (4.93 BPW) | Q8_0-Q4_K-Q4_K-Q5_K | 8.7486 ± 0.17123 | TBD |
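
For reference, a minimal sketch of how one might pull and build that PR locally (this is an assumed workflow, not part of the commit above; it presumes a standard CMake build of llama.cpp, and the local branch name `pr-19460` is arbitrary):

```bash
# Assumed workflow: fetch the PR branch referenced in the README note
# into a local branch, then build llama.cpp with CMake.
git clone https://github.com/ggml-org/llama.cpp
cd llama.cpp
git fetch origin pull/19460/head:pr-19460   # the PR linked above
git checkout pr-19460
cmake -B build
cmake --build build --config Release -j
```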