Update README.md
README.md
CHANGED
|
@@ -12,11 +12,25 @@ tags:
|
|
| 12 |
|
| 13 |
exllamav3 quantizations of [MiniMaxAI/MiniMax-M2.5](https://huggingface.co/MiniMaxAI/MiniMax-M2.5).
|
| 14 |
|
| 15 |
[2.00 bpw h6](https://huggingface.co/MikeRoz/MiniMax-M2.5-exl3/tree/2.00bpw_H6) 61.054 GiB
|
| 16 |
[3.00 bpw h6](https://huggingface.co/MikeRoz/MiniMax-M2.5-exl3/tree/3.00bpw_H6) 81.613 GiB
|
| 17 |
[4.00 bpw h6](https://huggingface.co/MikeRoz/MiniMax-M2.5-exl3/tree/4.00bpw_H6) 108.087 GiB
|
| 18 |
[5.00 bpw h6](https://huggingface.co/MikeRoz/MiniMax-M2.5-exl3/tree/5.00bpw_H6) 134.561 GiB
|
| 19 |
|
| 20 |
|
| 21 |
[measurement.json - 2.0bpw_H6 vs 3.0bpw_H6](https://huggingface.co/MikeRoz/MiniMax-M2.5-exl3/blob/main/measurement_MiniMaxAI_MiniMax-M2.5-2.0-3.0.json)
|
| 22 |
[measurement.json - 3.0bpw_H6 vs 4.0bpw_H6](https://huggingface.co/MikeRoz/MiniMax-M2.5-exl3/blob/main/measurement_MiniMaxAI_MiniMax-M2.5-3.0-4.0.json)
|
| 12 |
|
| 13 |
exllamav3 quantizations of [MiniMaxAI/MiniMax-M2.5](https://huggingface.co/MiniMaxAI/MiniMax-M2.5).
|
| 14 |
|
| 15 |
+
### Optimized quants
|
| 16 |
+
|
| 17 |
+
[2.10 bpw h6](https://huggingface.co/MikeRoz/MiniMax-M2.5-exl3/tree/2.10bpw_H6) 57.292 GiB
|
| 18 |
+
[2.50 bpw h6](https://huggingface.co/MikeRoz/MiniMax-M2.5-exl3/tree/2.50bpw_H6) 67.838 GiB
|
| 19 |
+
[3.06 bpw h6](https://huggingface.co/MikeRoz/MiniMax-M2.5-exl3/tree/3.06bpw_H6) 82.656 GiB
|
| 20 |
+
|
| 21 |
+
### Straight quants
|
| 22 |
+
|
| 23 |
+
As the charts below show, a straight 4.00 bpw or 5.00 bpw quant is still better than the optimized 3.06 bpw or 2.50 bpw quants.
|
| 24 |
+
|
| 25 |
[2.00 bpw h6](https://huggingface.co/MikeRoz/MiniMax-M2.5-exl3/tree/2.00bpw_H6) 61.054 GiB
|
| 26 |
[3.00 bpw h6](https://huggingface.co/MikeRoz/MiniMax-M2.5-exl3/tree/3.00bpw_H6) 81.613 GiB
|
| 27 |
[4.00 bpw h6](https://huggingface.co/MikeRoz/MiniMax-M2.5-exl3/tree/4.00bpw_H6) 108.087 GiB
|
| 28 |
[5.00 bpw h6](https://huggingface.co/MikeRoz/MiniMax-M2.5-exl3/tree/5.00bpw_H6) 134.561 GiB
|
| 29 |
|
| 30 |
+
### KL divergence (K/L-D) and perplexity (PPL) graphs
|
| 31 |
+
|
| 32 |
+

|
| 33 |
+

|
| 34 |
|
| 35 |
[measurement.json - 2.0bpw_H6 vs 3.0bpw_H6](https://huggingface.co/MikeRoz/MiniMax-M2.5-exl3/blob/main/measurement_MiniMaxAI_MiniMax-M2.5-2.0-3.0.json)
|
| 36 |
[measurement.json - 3.0bpw_H6 vs 4.0bpw_H6](https://huggingface.co/MikeRoz/MiniMax-M2.5-exl3/blob/main/measurement_MiniMaxAI_MiniMax-M2.5-3.0-4.0.json)
|
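The sizes listed in the updated README make it straightforward to check which quant fits a given memory budget. The following is only a minimal sketch: the `Quant` type and the `pick_quant` helper are hypothetical names introduced for illustration, the budget counts weight files only (no KV cache or runtime overhead), and ranking candidates by bits per weight is an assumption. The K/L-D and PPL graphs remain the actual guide to quality.

```python
from typing import NamedTuple, Optional


class Quant(NamedTuple):
    branch: str      # branch name as used in the repository's tree/ URLs
    bpw: float       # bits per weight
    size_gib: float  # size as listed in the README
    optimized: bool  # True for "Optimized quants", False for "Straight quants"


# Values copied from the README lists above.
QUANTS = [
    Quant("2.10bpw_H6", 2.10, 57.292, True),
    Quant("2.50bpw_H6", 2.50, 67.838, True),
    Quant("3.06bpw_H6", 3.06, 82.656, True),
    Quant("2.00bpw_H6", 2.00, 61.054, False),
    Quant("3.00bpw_H6", 3.00, 81.613, False),
    Quant("4.00bpw_H6", 4.00, 108.087, False),
    Quant("5.00bpw_H6", 5.00, 134.561, False),
]


def pick_quant(budget_gib: float) -> Optional[Quant]:
    """Return the highest-bpw quant whose listed size fits the budget.

    Assumption: higher bpw is treated as a proxy for higher quality.
    Returns None when no listed quant fits.
    """
    fitting = [q for q in QUANTS if q.size_gib <= budget_gib]
    if not fitting:
        return None
    return max(fitting, key=lambda q: q.bpw)


if __name__ == "__main__":
    for budget in (50.0, 65.0, 90.0, 140.0):
        choice = pick_quant(budget)
        label = choice.branch if choice else "nothing fits"
        print(f"{budget:6.1f} GiB -> {label}")
```

The selected branch name can then be passed as the `revision` argument of `huggingface_hub.snapshot_download` together with `repo_id="MikeRoz/MiniMax-M2.5-exl3"` to fetch only that quant.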