# Apertus-8B-Instruct-2509-bnb-8bit

This is an INT8 dynamically quantized (W8A8) version of swiss-ai/Apertus-8B-Instruct-2509, produced with llm-compressor.

The fineweb-edu-score-2 dataset was used for calibration.
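Dynamic INT8 quantization computes a scale for each tensor from its own value range at runtime, rather than from fixed calibration statistics. As an illustration only (not the llm-compressor implementation), a minimal NumPy sketch of the symmetric round-trip:

```python
import numpy as np

def quantize_int8(x):
    # Symmetric dynamic quantization: derive the scale from the
    # tensor's own maximum absolute value, mapping it to +/-127.
    scale = np.max(np.abs(x)) / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    # Recover an approximation of the original float values.
    return q.astype(np.float32) * scale

x = np.array([0.5, -1.27, 0.0, 1.0], dtype=np.float32)
q, scale = quantize_int8(x)
x_hat = dequantize(q, scale)
# q -> [50, -127, 0, 100]
```

Here the largest-magnitude value (-1.27) maps exactly to -127, and the reconstruction error for the other values is bounded by half a quantization step.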

## Quantization Details

  • Quantization Scheme: W8A8
  • Method: Dynamic quantization of weights and activations to INT8 (W8A8) format
  • Targets: All Linear layers
  • Ignored Layers: lm_head (kept in higher precision for better output quality)
  • Tool: llm-compressor
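The settings listed above would correspond to an llm-compressor recipe roughly like the following. This is a hypothetical sketch, not the recipe actually used; field names and layout can differ between llm-compressor versions.

```yaml
# Hypothetical recipe approximating the card's stated settings
quant_stage:
  quant_modifiers:
    QuantizationModifier:
      targets: ["Linear"]     # quantize all Linear layers
      ignore: ["lm_head"]     # keep the output head in higher precision
      scheme: "W8A8"          # INT8 weights and activations
```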