# Apertus-8B-Instruct-2509-bnb-8bit

This is an INT8 dynamically quantized (W8A8) version of swiss-ai/Apertus-8B-Instruct-2509, produced with llm-compressor.

The fineweb-edu-score-2 dataset was used for calibration.
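Dynamic INT8 quantization computes a scale for each tensor from its own value range at runtime, rather than from fixed calibration statistics. As an illustration only (not the llm-compressor implementation), a minimal NumPy sketch of the symmetric round-trip:

```python
import numpy as np

def quantize_int8(x):
    # Symmetric dynamic quantization: derive the scale from the
    # tensor's own maximum absolute value, mapping it to +/-127.
    scale = np.max(np.abs(x)) / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    # Recover an approximation of the original float values.
    return q.astype(np.float32) * scale

x = np.array([0.5, -1.27, 0.0, 1.0], dtype=np.float32)
q, scale = quantize_int8(x)
x_hat = dequantize(q, scale)
# q -> [50, -127, 0, 100]
```

Here the largest-magnitude value (-1.27) maps exactly to -127, and the reconstruction error for the other values is bounded by half a quantization step.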

## Quantization Details

  • Quantization Scheme: W8A8
  • Method: Dynamic quantization of weights and activations to INT8 (W8A8) format
  • Targets: All Linear layers
  • Ignored Layers: lm_head (kept in higher precision for better output quality)
  • Tool: llm-compressor
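The settings listed above would correspond to an llm-compressor recipe roughly like the following. This is a hypothetical sketch, not the recipe actually used; field names and layout can differ between llm-compressor versions.

```yaml
# Hypothetical recipe approximating the card's stated settings
quant_stage:
  quant_modifiers:
    QuantizationModifier:
      targets: ["Linear"]     # quantize all Linear layers
      ignore: ["lm_head"]     # keep the output head in higher precision
      scheme: "W8A8"          # INT8 weights and activations
```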