| | --- |
| | license: other |
| | library_name: diffusers |
| | tags: |
| | - text-to-image |
| | - z-image |
| | - diffusers |
| | - quantized |
| | - int8 |
| | - sdnq |
| | - safetensors |
| | pipeline_tag: text-to-image |
| | --- |
| | |
| | # Tongyi-MAI/Z-Image-Turbo - Quantized (8-bit) |
| |
|
| | ## Overview |
| | This is a **quantized version** of [Tongyi-MAI/Z-Image-Turbo](https://huggingface.co/Tongyi-MAI/Z-Image-Turbo). |
| |
|
| | The transformer and VAE have been quantized to 8-bit using SDNQ; the text encoder is included unquantized. The original folder structure is preserved, so each component loads from its usual subfolder. |
| |
|
| | ## Architecture |
| | - **Pipeline**: ZImagePipeline |
| | - **Main component**: ZImageTransformer2DModel |
| | - **Quantization**: 8-bit |
| |
|
| | ## Usage |
| |
|
| | ```python |
| | import torch |
| | from diffusers import ZImagePipeline, ZImageTransformer2DModel, AutoencoderKL, FlowMatchEulerDiscreteScheduler |
| | from transformers import Qwen3Model, AutoTokenizer |
| | from sdnq import load_sdnq_model |
| | |
| | model_path = "Tongyi-MAI_Z-Image-Turbo-int8" |
| | |
| | # Load transformer with SDNQ (quantized to 8-bit) |
| | transformer = load_sdnq_model( |
| |     f"{model_path}/transformer", |
| |     model_cls=ZImageTransformer2DModel, |
| |     device="cpu", |
| | ) |
| | |
| | # Load other components from this model (all included!) |
| | vae = AutoencoderKL.from_pretrained(f"{model_path}/vae", torch_dtype=torch.float16) |
| | text_encoder = Qwen3Model.from_pretrained(f"{model_path}/text_encoder", torch_dtype=torch.float16) |
| | tokenizer = AutoTokenizer.from_pretrained(f"{model_path}/tokenizer") |
| | scheduler = FlowMatchEulerDiscreteScheduler.from_pretrained(f"{model_path}/scheduler") |
| | |
| | # Construct pipeline |
| | pipe = ZImagePipeline( |
| |     transformer=transformer, |
| |     vae=vae, |
| |     text_encoder=text_encoder, |
| |     tokenizer=tokenizer, |
| |     scheduler=scheduler, |
| | ) |
| | |
| | pipe.to("cuda") |
| | |
| | # Generate an image |
| | image = pipe( |
| |     prompt="A serene mountain landscape at sunrise", |
| |     num_inference_steps=20, |
| | ).images[0] |
| | image.save("output.png") |
| | ``` |
| |
|
| | ## Components |
| |
|
| | - ✅ **transformer** (ZImageTransformer2DModel) - Quantized to 8-bit |
| | - ✅ **vae** (AutoencoderKL) - Quantized to 8-bit |
| |
|
| | **Note**: One component is included unquantized to work around an SDNQ bug: |
| | - 📦 **text_encoder** (Qwen3Model) - Included unquantized |
| | |
| | |
| | ## Quantization Details |
| | - **Original model**: [Tongyi-MAI/Z-Image-Turbo](https://huggingface.co/Tongyi-MAI/Z-Image-Turbo) |
| | - **Quantization**: 8-bit |
| | - **Quantizer**: SDNQ |
| | - **Date**: 2026-01-18 13:41:50 |
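
To illustrate what "8-bit" means for the weights, here is a minimal sketch of symmetric per-tensor int8 quantization in plain Python. This is a generic textbook scheme chosen for illustration, not SDNQ's actual algorithm; the function names `quantize_int8` and `dequantize_int8` are hypothetical.

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale."""
    max_abs = max(abs(w) for w in weights)
    # Guard against an all-zero tensor, where any scale works.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize_int8(q, scale):
    """Recover approximate float values from the stored integers."""
    return [v * scale for v in q]


weights = [0.5, -1.0, 0.25, 0.0]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)

# The rounding error per weight is bounded by half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_error)
```

Each weight is stored in one byte plus a shared scale, compared with two bytes per weight in float16, which is where most of the size reduction comes from.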
| | |
| | ## Size Reduction |
| | - Original: ~30GB (estimated) |
| | - Quantized: See individual component sizes |
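
The individual component sizes can be measured locally once the model folder is downloaded. The sketch below sums file sizes per top-level subfolder; the helper names `component_sizes` and `format_gib` are hypothetical, and the folder name is the one used in the usage example above.

```python
from pathlib import Path


def component_sizes(model_dir):
    """Return total bytes of all files under each top-level subfolder."""
    root = Path(model_dir)
    sizes = {}
    for sub in sorted(p for p in root.iterdir() if p.is_dir()):
        if sub.name.startswith("."):
            continue  # skip hidden folders such as .git or .cache
        sizes[sub.name] = sum(
            f.stat().st_size for f in sub.rglob("*") if f.is_file()
        )
    return sizes


def format_gib(num_bytes):
    """Format a byte count as GiB with two decimals."""
    return f"{num_bytes / 1024**3:.2f} GiB"


if __name__ == "__main__":
    model_path = "Tongyi-MAI_Z-Image-Turbo-int8"  # local folder, as in the usage example
    if Path(model_path).is_dir():
        for name, size in component_sizes(model_path).items():
            print(f"{name:<14}{format_gib(size)}")
```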
| | |
| | ## Notes |
| | - This is a complete drop-in replacement: the transformer, VAE, text encoder, tokenizer, and scheduler are all included |
| | - Requires the `sdnq` library: `pip install sdnq` |
| | - 8-bit SDNQ quantization reduces model size; quality loss relative to the original is expected to be minimal |
| | - The text encoder is included unquantized due to an SDNQ library limitation |
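
Before running the usage example, it may help to confirm that the required packages are importable. This is a minimal sketch using only the standard library; the package list is inferred from the imports in the usage example, and `missing_packages` is a hypothetical helper name.

```python
from importlib.util import find_spec

# Top-level modules imported by the usage example.
REQUIRED = ("torch", "diffusers", "transformers", "sdnq")


def missing_packages(names=REQUIRED):
    """Return the names that cannot be found on the import path."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages: " + ", ".join(missing))
        print("Install with: pip install " + " ".join(missing))
    else:
        print("All required packages are importable.")
```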
| | |
| | --- |
| | Quantized with [BugQuant](https://github.com/yourusername/BugQuant) |
| | |