LightOnOCR-2-1B-NVFP4

This is an NVFP4 variant of lightonai/LightOnOCR-2-1B, quantized with NVIDIA Model Optimizer post-training quantization (PTQ) for compatibility with vLLM's `modelopt_fp4` quantization backend.

Upstream Model

lightonai/LightOnOCR-2-1B

Quantization Notes

  • Quantization format: NVFP4
  • Runtime target: vLLM with quantization="modelopt_fp4"
  • Tooling: NVIDIA Model Optimizer
  • Selective policy used for current vLLM compatibility:
    • Quantized: language model and vision projection linear layers
    • Kept BF16: vision encoder and vision patch merger
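The selective policy above can be sketched as a module-name filter. The module paths below are illustrative assumptions, not the checkpoint's exact parameter names, which depend on the model's state dict:

```python
# Hedged sketch of the selective NVFP4 policy described above.
# Patterns are assumptions for illustration; inspect the checkpoint
# for the real module names before relying on them.
QUANTIZE_PATTERNS = ("language_model.", "multi_modal_projector.")
KEEP_BF16_PATTERNS = ("vision_tower.", "patch_merger.")

def should_quantize(module_name: str) -> bool:
    """Return True if a linear layer should be quantized to NVFP4."""
    if any(p in module_name for p in KEEP_BF16_PATTERNS):
        return False  # vision encoder and patch merger stay BF16
    return any(p in module_name for p in QUANTIZE_PATTERNS)
```

A filter like this would be passed to the quantization tool's layer-selection config so only the language-model and projector linears receive NVFP4 weights.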

Usage (vLLM)

vllm serve switzerchees/LightOnOCR-2-1B-NVFP4 \
  --quantization modelopt_fp4 \
  --limit-mm-per-prompt '{"image": 1}' \
  --mm-processor-cache-gb 0 \
  --no-enable-prefix-caching
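Once the server is up, requests go through vLLM's OpenAI-compatible chat completions API. A minimal sketch of building such a request, assuming the default endpoint `http://localhost:8000/v1/chat/completions`; the placeholder image bytes and prompt text are assumptions, not the model's required format:

```python
import base64

# Hypothetical client sketch for the server started above.
def build_ocr_request(image_bytes: bytes,
                      prompt: str = "Transcribe the text in this image.") -> dict:
    """Build an OpenAI-style chat payload with an inline base64 image."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return {
        "model": "switzerchees/LightOnOCR-2-1B-NVFP4",
        "messages": [{
            "role": "user",
            "content": [
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{b64}"}},
                {"type": "text", "text": prompt},
            ],
        }],
    }

# Use real PNG bytes in practice, then POST the payload to
# http://localhost:8000/v1/chat/completions (e.g. with requests or httpx).
payload = build_ocr_request(b"\x89PNG placeholder")
```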

License

This derivative keeps the same license as the upstream model: Apache-2.0. See the original model card for upstream attribution and details: lightonai/LightOnOCR-2-1B
