---
license: mit
base_model: MiniMaxAI/MiniMax-M2
base_model_relation: quantized
quantized_by: turboderp
tags:
  - exl3
---

EXL3 quants of MiniMax-M2

⚠️ Requires ExLlamaV3 v0.0.12 (or the v0.0.11 dev branch)

Base bitrates:

- 2.00 bits per weight
- 3.00 bits per weight
- 4.00 bits per weight

Optimized:

- 2.04 bits per weight
- 2.27 bits per weight
- 3.04 bits per weight
- 3.50 bits per weight
- 4.03 bits per weight
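
To help pick a bitrate, here is a rough sketch of how bits per weight translate into weight storage. It assumes the simple relation size ≈ parameters × bpw / 8 and a total parameter count of about 230B for MiniMax-M2; neither figure comes from this card, and the helper name `estimate_weight_size_gb` is hypothetical. Real downloads and VRAM use will differ, since some tensors may be stored at a different precision and the KV cache and activations are not counted.

```python
def estimate_weight_size_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate storage for the quantized weights in GB (10^9 bytes).

    Assumes every parameter is stored at the given average bitrate,
    which is only an approximation.
    """
    total_bits = num_params * bits_per_weight
    return total_bits / 8 / 1e9


# Assumption: roughly 230e9 total parameters (not stated in this README).
ASSUMED_PARAMS = 230e9

for bpw in (2.00, 2.04, 2.27, 3.00, 3.04, 3.50, 4.00, 4.03):
    size = estimate_weight_size_gb(ASSUMED_PARAMS, bpw)
    print(f"{bpw:.2f} bpw: about {size:.0f} GB of weights")
```

Treat the results as a lower bound when sizing hardware.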

| Quant | KL-div | ppl | HumanEval@1 |
|---|---|---|---|
| 2.00 bpw | 0.400 | 10.92 | 80.5% |
| 2.04 bpw | 0.297 | 10.23 | 87.1% |
| 2.27 bpw | 0.252 | 9.78 | 88.4% |
| 3.00 bpw | 0.141 | 8.99 | 87.8% |
| 3.04 bpw | 0.117 | 8.73 | 87.2% |
| 3.50 bpw | 0.094 | 8.78 | 88.4% |
| 4.00 bpw | 0.087 | 8.58 | 89.6% |
| 4.03 bpw | 0.077 | 8.61 | 87.8% |
| original | - | 8.51 | 87.2%¹ |

¹ Unconfirmed
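
One way to read the perplexity column is as a relative increase over the original model. The sketch below copies the ppl values from the table above and computes that percentage; the helper name `ppl_increase_pct` is hypothetical, and the comparison says nothing about how the measurements themselves were taken.

```python
# Perplexity values copied from the table above, keyed by bits per weight.
PPL_BY_BPW = {
    2.00: 10.92,
    2.04: 10.23,
    2.27: 9.78,
    3.00: 8.99,
    3.04: 8.73,
    3.50: 8.78,
    4.00: 8.58,
    4.03: 8.61,
}
ORIGINAL_PPL = 8.51  # "original" row of the table


def ppl_increase_pct(ppl: float, baseline: float = ORIGINAL_PPL) -> float:
    """Relative perplexity increase over the baseline, in percent."""
    return (ppl - baseline) / baseline * 100.0


for bpw, ppl in sorted(PPL_BY_BPW.items()):
    print(f"{bpw:.2f} bpw: +{ppl_increase_pct(ppl):.1f}% perplexity vs. original")
```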