# VAEs for Image Generation
This repository hosts a curated collection of VAE checkpoints used by diffusion and transformer-based image generation pipelines.
## Available VAEs

### AutoencoderKL
| Model | Source | Latent Channels |
|---|---|---|
| SD21-VAE | Stable Diffusion 2.1 | 4 |
| SDXL-VAE | Stable Diffusion XL | 4 |
| SD35-VAE | Stable Diffusion 3.5 | 16 |
| FLUX1-VAE | FLUX.1 | 16 |
| FLUX2-VAE | FLUX.2 | 32 |
| SANA-VAE | SANA (DC-AE) | 32 |
| Qwen-VAE | Qwen-Image | 16 |
### VQModel

| Model | Source | `latent_channels` | `num_vq_embeddings` | `vq_embed_dim` | `sample_size` |
|---|---|---|---|---|---|
| VQDIFFUSION-VQVAE | VQ-Diffusion (microsoft/vq-diffusion-ithq) | 256 | 4096 | 128 | 32 |
| IBQ-VQVAE-1024 | IBQ (TencentARC/SEED) | 256 | 1024 | 256 | 32 |
| IBQ-VQVAE-8192 | IBQ (TencentARC/SEED) | 256 | 8192 | 256 | 32 |
| IBQ-VQVAE-16384 | IBQ (TencentARC/SEED) | 256 | 16384 | 256 | 32 |
| IBQ-VQVAE-262144 | IBQ (TencentARC/SEED) | 256 | 262144 | 256 | 32 |
| MOVQGAN-67M | MOVQGAN | 4 | 16384 | 4 | 256 |
| MOVQGAN-102M | MOVQGAN | 4 | 16384 | 4 | 256 |
| MOVQGAN-270M | MOVQGAN | 4 | 16384 | 4 | 256 |
## Diffusers usage

`AutoencoderKL` (SD, FLUX, SANA, Qwen, etc.):

```python
from diffusers import AutoencoderKL

vae = AutoencoderKL.from_pretrained(
    "BiliSakura/VAEs",
    subfolder="SDXL-VAE",
)
```
`VQModel` (VQ-Diffusion, IBQ, MOVQGAN):

```python
from diffusers import VQModel

vae = VQModel.from_pretrained(
    "BiliSakura/VAEs",
    subfolder="VQDIFFUSION-VQVAE",
)
```
## Notes
- All models are VAE checkpoints intended for inference use in their corresponding pipelines.
- Latent channel counts are listed to help match each VAE with a compatible denoising backbone.