---
base_model: MiniMaxAI/MiniMax-M2.5
base_model_relation: quantized
license: other
license_name: modified-mit
license_link: LICENSE
tags:
- gguf
- quantized
- llama.cpp
---

# MiniMax-M2.5 GGUF

GGUF quantization of `MiniMaxAI/MiniMax-M2.5`, created with `llama.cpp`.

## Model Details

| Property | Value |
| --- | --- |
| Base model | MiniMaxAI/MiniMax-M2.5 |
| Architecture | Mixture of Experts (MoE) |
| Total parameters | 230B |
| Active parameters | 10B per token |
| Layers | 62 |
| Total experts | 256 |
| Active experts per token | 8 |
| Source precision | FP8 (`float8_e4m3fn`) |
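To sanity-check these properties locally, the GGUF metadata can be inspected without loading the tensors. The sketch below assumes the `gguf` Python package is installed and uses the Q6_K filename from the Usage example further down.

```bash
# Dump GGUF header metadata (architecture, layer count, expert count, ...)
# without loading the model weights.
pip install gguf
gguf-dump MiniMax-M2.5.Q6_K.gguf | head -n 60
```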

## Available Quantizations

| Quantization | Size | Description |
| --- | --- | --- |
| Q6_K | 175 GB | 6-bit K-quant, strong quality/size balance |
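Individual files can be fetched with `huggingface-cli`; the repository path below is a placeholder and should be replaced with this repository's actual id.

```bash
# Download only the Q6_K file (replace <user>/MiniMax-M2.5-GGUF with this repo's id).
huggingface-cli download <user>/MiniMax-M2.5-GGUF \
  MiniMax-M2.5.Q6_K.gguf --local-dir .
```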

## Usage

These GGUFs can be used with `llama.cpp` and compatible frontends.

```bash
# Example with llama-cli
llama-cli -m MiniMax-M2.5.Q6_K.gguf -p "Hello" -n 128
```
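To serve the model over an OpenAI-compatible HTTP endpoint, `llama-server` from the same `llama.cpp` build can be used. The flags below are a minimal sketch; tune `-ngl` (GPU layers) and `-c` (context size) to the available hardware.

```bash
# Serve on port 8080, offloading as many layers as fit on the GPU.
llama-server -m MiniMax-M2.5.Q6_K.gguf -c 8192 -ngl 99 --port 8080
```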

## Notes

- The source model uses FP8 (`float8_e4m3fn`) precision.
- This is a large MoE model; the Q6_K file alone is roughly 175 GB, so plan system RAM and/or VRAM accordingly (see the offload sketch below).
- Quantized from the official `MiniMaxAI/MiniMax-M2.5` weights.
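
Because only about 10B parameters are active per token, one common way to run large MoE GGUFs on limited VRAM is to keep the dense/attention weights on the GPU while overriding the expert (FFN) tensors to CPU memory. This is a sketch, not a tested recipe: it assumes a recent `llama.cpp` build with `--override-tensor` (`-ot`) support, and the regex may need adjusting to the actual tensor names in this GGUF.

```bash
# Offload all layers to GPU, but keep the MoE expert tensors in CPU RAM.
# Assumes a llama.cpp build that supports --override-tensor (-ot).
llama-cli -m MiniMax-M2.5.Q6_K.gguf -ngl 99 \
  --override-tensor "\.ffn_.*_exps\.=CPU" \
  -p "Hello" -n 128
```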