Add model-index with benchmark evaluations
Added structured evaluation results from README benchmark tables:
**Reasoning Benchmarks:**

- AIME25: 0.721
- AIME24: 0.775
- GPQA Diamond: 0.534
- LiveCodeBench: 0.548

**Instruct Benchmarks:**

- Arena Hard: 0.305
- WildBench: 56.8
- MATH Maj@1: 0.830
- MM MTBench: 7.83

**Base Model Benchmarks:**

- Multilingual MMLU: 0.652
- MATH CoT 2-Shot: 0.601
- AGIEval 5-shot: 0.511
- MMLU Redux 5-shot: 0.735
- MMLU 5-shot: 0.707
- TriviaQA 5-shot: 0.592

Total: 14 benchmarks across reasoning, instruction-following, and base capabilities.
This enables the model to appear in leaderboards and makes it easier to compare with other models.
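The 14-metric total can be sanity-checked programmatically. A minimal sketch follows; the `model_index` dict below is a hand-written mirror of the metadata this PR adds (values copied from the tables above), not data loaded from the Hub:

```python
# Hand-written mirror of the model-index structure added by this PR;
# metric values are copied from the benchmark tables above.
model_index = [{
    "name": "Ministral-3-3B-Instruct-2512",
    "results": [
        {"dataset": {"name": "Reasoning Benchmarks"},
         "metrics": [
             {"name": "AIME25", "value": 0.721},
             {"name": "AIME24", "value": 0.775},
             {"name": "GPQA Diamond", "value": 0.534},
             {"name": "LiveCodeBench", "value": 0.548},
         ]},
        {"dataset": {"name": "Instruct Benchmarks"},
         "metrics": [
             {"name": "Arena Hard", "value": 0.305},
             {"name": "WildBench", "value": 56.8},
             {"name": "MATH Maj@1", "value": 0.830},
             {"name": "MM MTBench", "value": 7.83},
         ]},
        {"dataset": {"name": "Base Model Benchmarks"},
         "metrics": [
             {"name": "Multilingual MMLU", "value": 0.652},
             {"name": "MATH CoT 2-Shot", "value": 0.601},
             {"name": "AGIEval 5-shot", "value": 0.511},
             {"name": "MMLU Redux 5-shot", "value": 0.735},
             {"name": "MMLU 5-shot", "value": 0.707},
             {"name": "TriviaQA 5-shot", "value": 0.592},
         ]},
    ],
}]

# Count every metric entry across all result groups.
total_metrics = sum(
    len(result["metrics"])
    for entry in model_index
    for result in entry["results"]
)
print(total_metrics)  # 14
```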
Note: PR #6 only adds the `transformers` tag and doesn't conflict with this benchmark metadata addition.
README.md (changed):

```diff
@@ -16,11 +16,82 @@ license: apache-2.0
 inference: false
 base_model:
 - mistralai/Ministral-3-3B-Base-2512
-extra_gated_description:
-
-our <a href="https://mistral.ai/terms/">Privacy Policy</a>.
+extra_gated_description: If you want to learn more about how we process your personal
+  data, please read our <a href="https://mistral.ai/terms/">Privacy Policy</a>.
 tags:
 - mistral-common
+model-index:
+- name: Ministral-3-3B-Instruct-2512
+  results:
+  - task:
+      type: text-generation
+    dataset:
+      name: Reasoning Benchmarks
+      type: benchmark
+    metrics:
+    - name: AIME25
+      type: aime_2025
+      value: 0.721
+    - name: AIME24
+      type: aime_2024
+      value: 0.775
+    - name: GPQA Diamond
+      type: gpqa_diamond
+      value: 0.534
+    - name: LiveCodeBench
+      type: live_code_bench
+      value: 0.548
+    source:
+      name: Model README - Reasoning Benchmarks
+      url: https://huggingface.co/mistralai/Ministral-3-3B-Instruct-2512
+  - task:
+      type: text-generation
+    dataset:
+      name: Instruct Benchmarks
+      type: benchmark
+    metrics:
+    - name: Arena Hard
+      type: arena_hard
+      value: 0.305
+    - name: WildBench
+      type: wild_bench
+      value: 56.8
+    - name: MATH Maj@1
+      type: math_maj1
+      value: 0.83
+    - name: MM MTBench
+      type: mm_mtbench
+      value: 7.83
+    source:
+      name: Model README - Instruct Benchmarks
+      url: https://huggingface.co/mistralai/Ministral-3-3B-Instruct-2512
+  - task:
+      type: text-generation
+    dataset:
+      name: Base Model Benchmarks
+      type: benchmark
+    metrics:
+    - name: Multilingual MMLU
+      type: multilingual_mmlu
+      value: 0.652
+    - name: MATH CoT 2-Shot
+      type: math_cot_2shot
+      value: 0.601
+    - name: AGIEval 5-shot
+      type: agieval_5shot
+      value: 0.511
+    - name: MMLU Redux 5-shot
+      type: mmlu_redux_5shot
+      value: 0.735
+    - name: MMLU 5-shot
+      type: mmlu_5shot
+      value: 0.707
+    - name: TriviaQA 5-shot
+      type: triviaqa_5shot
+      value: 0.592
+    source:
+      name: Model README - Base Model Benchmarks
+      url: https://huggingface.co/mistralai/Ministral-3-3B-Instruct-2512
 ---
 
 # Ministral 3 3B Instruct 2512
```
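After merging, the edited front matter can be sanity-checked by splitting the YAML block from the README body. A minimal standard-library sketch; the inline `readme` string here is a toy stand-in mirroring only the keys this diff touches, not the real file contents:

```python
import re


def split_front_matter(readme_text):
    """Split a model-card README into (yaml_front_matter, body).

    The front matter is the text between the two leading '---' fences.
    Returns (None, readme_text) if no front matter is found.
    """
    match = re.match(r"^---\n(.*?)\n---\n(.*)$", readme_text, re.DOTALL)
    if match is None:
        return None, readme_text
    return match.group(1), match.group(2)


# Toy stand-in mirroring the keys touched by this diff.
readme = """---
tags:
- mistral-common
model-index:
- name: Ministral-3-3B-Instruct-2512
---

# Ministral 3 3B Instruct 2512
"""

meta, body = split_front_matter(readme)
print("model-index:" in meta)  # True
```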