---
base_model: MiniMaxAI/MiniMax-M2.5
base_model_relation: quantized
license: other
license_name: modified-mit
license_link: LICENSE
tags:
- gguf
- quantized
- llama.cpp
---
# MiniMax-M2.5 GGUF
GGUF quantization of MiniMaxAI/MiniMax-M2.5, created with llama.cpp.
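For reference, a quant like this is typically produced with llama.cpp's conversion and quantization tools. A minimal sketch, assuming the checkpoint has been downloaded to a local directory (file names and paths below are illustrative, and FP8 source tensors may require a recent llama.cpp build):

```bash
# Convert the HF checkpoint (downloaded to ./MiniMax-M2.5) to a
# high-precision GGUF, then quantize it down to Q6_K.
python convert_hf_to_gguf.py ./MiniMax-M2.5 --outtype bf16 \
  --outfile MiniMax-M2.5.BF16.gguf
./llama-quantize MiniMax-M2.5.BF16.gguf MiniMax-M2.5.Q6_K.gguf Q6_K
```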
## Model Details
| Property | Value |
|---|---|
| Base model | MiniMaxAI/MiniMax-M2.5 |
| Architecture | Mixture of Experts (MoE) |
| Total parameters | 230B |
| Active parameters | 10B per token |
| Layers | 62 |
| Total experts | 256 |
| Active experts per token | 8 |
| Source precision | FP8 (float8_e4m3fn) |
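As a sanity check on the numbers above, a back-of-envelope size estimate lands near the published Q6_K file size. The ~6.1 effective bits per weight is an assumption, since K-quants mix different bit widths across tensor types:

```bash
# 230B weights at ~6.1 effective bits each, in decimal GB
awk 'BEGIN { printf "%.0f GB\n", 230e9 * 6.1 / 8 / 1e9 }'   # ≈ 175 GB
```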
## Available Quantizations
| Quantization | Size | Description |
|---|---|---|
| Q6_K | 175 GB | 6-bit K-quant, strong quality/size balance |
## Usage
These GGUFs can be used with llama.cpp and compatible frontends.
```bash
# Example with llama-cli
llama-cli -m MiniMax-M2.5.Q6_K.gguf -p "Hello" -n 128
```
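For an OpenAI-compatible HTTP endpoint, `llama-server` from the same llama.cpp build works as well. The context size and port below are illustrative; adjust them to your hardware:

```bash
# Serve the model over HTTP on localhost:8080
llama-server -m MiniMax-M2.5.Q6_K.gguf -c 8192 --port 8080
```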
## Notes
- The source model uses FP8 (`float8_e4m3fn`) precision.
- This is a large MoE model; the Q6_K file alone is 175 GB, so plan RAM/VRAM accordingly (see the offloading sketch below).
- Quantized from the official `MiniMaxAI/MiniMax-M2.5` weights.
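If VRAM is the bottleneck, recent llama.cpp builds can keep the small always-active tensors on GPU while overriding the large per-expert FFN tensors to CPU RAM via `--override-tensor` (`-ot`). A sketch only: the tensor-name regex is an assumption for this architecture, so confirm the actual names with `gguf-dump` or the load logs:

```bash
# Offload all layers to GPU (-ngl 99), then force expert FFN tensors
# (names matching the regex) back onto CPU to fit in limited VRAM.
llama-cli -m MiniMax-M2.5.Q6_K.gguf -ngl 99 \
  -ot '\.ffn_.*_exps\.=CPU' \
  -p "Hello" -n 128
```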