TheCodingBug/Z-Image-Turbo-int8


	---
	license: other
	library_name: diffusers
	tags:
	- text-to-image
	- z-image
	- diffusers
	- quantized
	- int8
	- sdnq
	- safetensors
	pipeline_tag: text-to-image
	---

	# Tongyi-MAI/Z-Image-Turbo - Quantized (8-bit)

	## Overview
	This is a quantized version of [Tongyi-MAI/Z-Image-Turbo](https://huggingface.co/Tongyi-MAI/Z-Image-Turbo).

	The transformer and VAE have been quantized to 8-bit using SDNQ; the text encoder is included unquantized. The original folder structure is preserved for seamless integration.

	## Architecture
	- Pipeline: ZImagePipeline
	- Main component: ZImageTransformer2DModel
	- Quantization: 8-bit

	## Usage

	```python
	import torch
	from diffusers import (
	    AutoencoderKL,
	    FlowMatchEulerDiscreteScheduler,
	    ZImagePipeline,
	    ZImageTransformer2DModel,
	)
	from transformers import Qwen3Model, AutoTokenizer
	from sdnq import load_sdnq_model

	# Local directory containing the downloaded model files
	model_path = "Tongyi-MAI_Z-Image-Turbo-int8"

	# Load transformer with SDNQ (quantized to 8-bit)
	transformer = load_sdnq_model(
	    f"{model_path}/transformer",
	    model_cls=ZImageTransformer2DModel,
	    device="cpu",
	)

	# Load other components from this model (all included!)
	vae = AutoencoderKL.from_pretrained(f"{model_path}/vae", torch_dtype=torch.float16)
	text_encoder = Qwen3Model.from_pretrained(f"{model_path}/text_encoder", torch_dtype=torch.float16)
	tokenizer = AutoTokenizer.from_pretrained(f"{model_path}/tokenizer")
	scheduler = FlowMatchEulerDiscreteScheduler.from_pretrained(f"{model_path}/scheduler")

	# Construct pipeline
	pipe = ZImagePipeline(
	    transformer=transformer,
	    vae=vae,
	    text_encoder=text_encoder,
	    tokenizer=tokenizer,
	    scheduler=scheduler,
	)

	pipe.to("cuda")

	# Generate an image
	image = pipe(
	    prompt="A serene mountain landscape at sunrise",
	    num_inference_steps=20,
	).images[0]
	image.save("output.png")
	```
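
The usage example reads every component from a local directory, so the model files need to be on disk first. The following is a minimal sketch of one way to fetch them and verify the folder layout. It assumes the repository id is `TheCodingBug/Z-Image-Turbo-int8` (taken from the hosting page, not stated in this README) and uses `snapshot_download` from `huggingface_hub`. The helper names `missing_components` and `download_model` are hypothetical.

```python
from pathlib import Path

# Subfolders that the usage example reads from.
COMPONENTS = ("transformer", "vae", "text_encoder", "tokenizer", "scheduler")


def missing_components(model_path):
    """Return the component subfolders that are absent under model_path."""
    root = Path(model_path)
    return [name for name in COMPONENTS if not (root / name).is_dir()]


def download_model(repo_id="TheCodingBug/Z-Image-Turbo-int8",
                   local_dir="Tongyi-MAI_Z-Image-Turbo-int8"):
    """Download the repository into local_dir (repo_id is an assumption)."""
    # Imported lazily so the layout check works without huggingface_hub.
    from huggingface_hub import snapshot_download
    return snapshot_download(repo_id=repo_id, local_dir=local_dir)
```

After calling `download_model()` once, `missing_components("Tongyi-MAI_Z-Image-Turbo-int8")` should return an empty list if the layout matches what the usage example expects.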

	## Components

	- ✅ transformer (ZImageTransformer2DModel) - Quantized to 8-bit
	- ✅ vae (AutoencoderKL) - Quantized to 8-bit

	Note: Some components are included unquantized due to SDNQ library limitations:
	- 📦 text_encoder - Included unquantized (SDNQ bug workaround)


	## Quantization Details
	- Original model: [Tongyi-MAI/Z-Image-Turbo](https://huggingface.co/Tongyi-MAI/Z-Image-Turbo)
	- Quantization: 8-bit
	- Quantizer: SDNQ
	- Date: 2026-01-18 13:41:50
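
To illustrate what 8-bit quantization means for a weight tensor, here is a minimal sketch of generic symmetric per-tensor int8 quantization in plain Python. This is not SDNQ's actual algorithm, which this README does not describe; the function names and sample weights are made up for illustration.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization of a list of floats to int8 codes."""
    max_abs = max(abs(w) for w in weights)
    # Map the largest magnitude to 127; guard against an all-zero tensor.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale


def dequantize_int8(codes, scale):
    """Recover approximate float values from int8 codes and their scale."""
    return [c * scale for c in codes]


weights = [0.5, -1.27, 0.01, 1.0]  # illustrative values only
codes, scale = quantize_int8(weights)
restored = dequantize_int8(codes, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

Each weight is stored as a one-byte integer plus a shared scale, and the round-trip error per weight is bounded by half the scale. Real quantizers typically use finer-grained scales (per channel or per group) to tighten that bound.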

	## Size Reduction
	- Original: ~30GB (estimated)
	- Quantized: See individual component sizes
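
Since the quantized size is left to the individual component folders, a short script can total them up. This is a sketch using only the standard library; `dir_size_bytes` and `component_sizes` are hypothetical helper names, and the directory passed in is assumed to be a local copy of this repository.

```python
import os


def dir_size_bytes(path):
    """Sum the sizes of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


def component_sizes(model_path):
    """Map each immediate subfolder of model_path to its size in bytes."""
    sizes = {}
    for entry in sorted(os.listdir(model_path)):
        full = os.path.join(model_path, entry)
        if os.path.isdir(full):
            sizes[entry] = dir_size_bytes(full)
    return sizes
```

For example, `component_sizes("Tongyi-MAI_Z-Image-Turbo-int8")` returns a dictionary keyed by folder name; dividing each value by `1e9` gives sizes in GB.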

	## Notes
	- This is a complete drop-in replacement: all pipeline components are included
	- 8-bit SDNQ quantization reduces model size with minimal quality loss
	- Requires the `sdnq` library: `pip install sdnq`
	- The text encoder is included unquantized due to an SDNQ library limitation
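
Because the usage example fails at import time if a dependency is missing, it can help to check the environment first. The sketch below uses the standard library's `importlib.util.find_spec`; the `REQUIRED` tuple lists the modules imported in the usage example, and `missing_packages` is a hypothetical helper name.

```python
import importlib.util

# Top-level modules imported by the usage example.
REQUIRED = ("torch", "diffusers", "transformers", "sdnq")


def missing_packages(names=REQUIRED):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]
```

An empty result from `missing_packages()` means all four modules are importable; any names returned need to be installed before running the pipeline.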

	---
	Quantized with [BugQuant](https://github.com/yourusername/BugQuant)