# VAEs for Image Generation

This repository hosts a curated collection of VAE checkpoints used by diffusion and transformer-based image generation pipelines.

## Available VAEs

### AutoencoderKL

| Model | Source | Latent channels |
| --- | --- | --- |
| SD21-VAE | Stable Diffusion 2.1 | 4 |
| SDXL-VAE | Stable Diffusion XL | 4 |
| SD35-VAE | Stable Diffusion 3.5 | 16 |
| FLUX1-VAE | FLUX.1 | 16 |
| FLUX2-VAE | FLUX.2 | 32 |
| SANA-VAE | SANA (DC-AE) | 32 |
| Qwen-VAE | Qwen-Image | 16 |

### VQModel

| Model | Source | `latent_channels` | `num_vq_embeddings` | `vq_embed_dim` | `sample_size` |
| --- | --- | --- | --- | --- | --- |
| VQDIFFUSION-VQVAE | VQ-Diffusion (microsoft/vq-diffusion-ithq) | 256 | 4096 | 128 | 32 |
| IBQ-VQVAE-1024 | IBQ (TencentARC/SEED) | 256 | 1024 | 256 | 32 |
| IBQ-VQVAE-8192 | IBQ (TencentARC/SEED) | 256 | 8192 | 256 | 32 |
| IBQ-VQVAE-16384 | IBQ (TencentARC/SEED) | 256 | 16384 | 256 | 32 |
| IBQ-VQVAE-262144 | IBQ (TencentARC/SEED) | 256 | 262144 | 256 | 32 |
| MOVQGAN-67M | MOVQGAN | 4 | 16384 | 4 | 256 |
| MOVQGAN-102M | MOVQGAN | 4 | 16384 | 4 | 256 |
| MOVQGAN-270M | MOVQGAN | 4 | 16384 | 4 | 256 |

## Diffusers usage

`AutoencoderKL` (SD, FLUX, SANA, Qwen, etc.):

```python
from diffusers import AutoencoderKL

# Load a specific VAE by pointing `subfolder` at its directory in this repo.
vae = AutoencoderKL.from_pretrained(
    "BiliSakura/VAEs",
    subfolder="SDXL-VAE",
)
```
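
Once loaded, the VAE maps images to latents and back. A minimal round-trip sketch, where the random tensor is only a stand-in for a real image preprocessed to the `[-1, 1]` range:

```python
import torch
from diffusers import AutoencoderKL

vae = AutoencoderKL.from_pretrained("BiliSakura/VAEs", subfolder="SDXL-VAE")
vae.eval()

image = torch.randn(1, 3, 512, 512)  # placeholder for a preprocessed image in [-1, 1]
with torch.no_grad():
    latents = vae.encode(image).latent_dist.sample()  # e.g. (1, 4, 64, 64) for a 4-channel VAE
    reconstruction = vae.decode(latents).sample       # back to (1, 3, 512, 512)
```

In diffusion pipelines, the latents are typically multiplied by `vae.config.scaling_factor` before being passed to the backbone, and divided by it again before decoding.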

`VQModel` (VQ-Diffusion, IBQ, MOVQGAN):

```python
from diffusers import VQModel

# VQ checkpoints load the same way; pick any subfolder from the VQModel table above.
vae = VQModel.from_pretrained(
    "BiliSakura/VAEs",
    subfolder="VQDIFFUSION-VQVAE",
)
```
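
`VQModel` works much like `AutoencoderKL`, except that by default `decode` snaps the continuous encoder output to the nearest codebook entries before decoding. A minimal sketch using the MOVQGAN-270M checkpoint from the table above, again with a random tensor standing in for a real image:

```python
import torch
from diffusers import VQModel

vqvae = VQModel.from_pretrained("BiliSakura/VAEs", subfolder="MOVQGAN-270M")
vqvae.eval()

image = torch.randn(1, 3, 256, 256)  # placeholder input at the model's sample_size
with torch.no_grad():
    latents = vqvae.encode(image).latents          # continuous encoder output
    reconstruction = vqvae.decode(latents).sample  # quantized against the codebook, then decoded
```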

## Notes

- All models are VAE checkpoints intended for inference in their corresponding pipelines.
- The latent channel count is listed so each VAE can be matched to a backbone that expects the same number of latent channels (see the sketch below).
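
To verify a pairing programmatically, compare the VAE's `latent_channels` against the backbone's input channels. A sketch assuming the Stable Diffusion 2.1 UNet from `stabilityai/stable-diffusion-2-1`:

```python
from diffusers import AutoencoderKL, UNet2DConditionModel

vae = AutoencoderKL.from_pretrained("BiliSakura/VAEs", subfolder="SD21-VAE")
unet = UNet2DConditionModel.from_pretrained(
    "stabilityai/stable-diffusion-2-1", subfolder="unet"
)

# The VAE must produce exactly as many latent channels as the backbone consumes.
assert vae.config.latent_channels == unet.config.in_channels
```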