---
license: apache-2.0
library_name: diffusers
pipeline_tag: image-to-image
base_model: shallowdream204/BitDance-Tokenizer
language:
- en
tags:
- bitdance
- tokenizer
- autoencoder
- custom-architecture
- diffusers
---
# BitDance-Tokenizer (Diffusers)
Diffusers-formatted BitDance tokenizer autoencoders (AE) converted from the upstream BitDance tokenizer checkpoints.
## Available Autoencoders
- `ae_d16c32` (`z_channels=32`, `gan_decoder=false`)
- `ae_d32c128` (`z_channels=128`, `gan_decoder=true`)
- `ae_d32c256` (`z_channels=256`, `gan_decoder=true`)
Each subfolder includes:
- `config.json` with the autoencoder architecture
- `conversion_metadata.json` documenting the source checkpoint and config
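For a quick look at what a subfolder contains, the two JSON files can be read with the standard library. This is a minimal sketch that assumes the repository has already been downloaded to a local directory; `read_ae_metadata` is a hypothetical helper name, and no particular keys are assumed inside either file.

```python
import json
from pathlib import Path


def read_ae_metadata(repo_dir, subfolder):
    """Return (config, conversion_metadata) dicts for one autoencoder subfolder.

    Hypothetical helper: assumes a local copy of the repo laid out as
    <repo_dir>/<subfolder>/config.json and conversion_metadata.json.
    """
    folder = Path(repo_dir) / subfolder
    with open(folder / "config.json", encoding="utf-8") as f:
        config = json.load(f)
    with open(folder / "conversion_metadata.json", encoding="utf-8") as f:
        metadata = json.load(f)
    return config, metadata
```

For example, `config, metadata = read_ae_metadata(".", "ae_d16c32")` followed by `print(sorted(config))` lists the architecture keys without loading any weights.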
## Test (load tokenizer only)
This repo is self-contained: it includes `bitdance_diffusers` (copied from BitDance-14B-64x-diffusers), which provides the `BitDanceAutoencoder` class. The bundled test verifies loading and encode/decode: it loads all three autoencoders and runs a quick encode/decode check with `ae_d16c32` (no full image generation).
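The loading half of that check can be sketched as a loop over the three subfolders. This is an illustrative sketch rather than the bundled test: `load_all_autoencoders` is a hypothetical helper, and the loader is passed in as an argument so the sketch assumes nothing about `BitDanceAutoencoder` beyond the `from_pretrained(..., subfolder=...)` call used elsewhere in this README.

```python
# Subfolder names as listed under "Available Autoencoders".
AE_SUBFOLDERS = ("ae_d16c32", "ae_d32c128", "ae_d32c256")


def load_all_autoencoders(load_fn, repo):
    """Load every autoencoder subfolder and return them keyed by name.

    `load_fn` is any callable with the shape load_fn(repo, subfolder=...),
    for example BitDanceAutoencoder.from_pretrained (hypothetical usage).
    """
    loaded = {}
    for name in AE_SUBFOLDERS:
        loaded[name] = load_fn(repo, subfolder=name)
    return loaded
```

Once `bitdance_diffusers` is importable, calling `load_all_autoencoders(BitDanceAutoencoder.from_pretrained, "BiliSakura/BitDance-Tokenizer-diffusers")` would return a dict with one loaded autoencoder per subfolder.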
## Loading tokenizer autoencoders
```python
import sys
from pathlib import Path

# Self-contained: add the local path so bitdance_diffusers is found
BASE_DIR = Path(__file__).resolve().parent
sys.path.insert(0, str(BASE_DIR))

from bitdance_diffusers import BitDanceAutoencoder

# Load any tokenizer autoencoder (use repo path or local path)
ae = BitDanceAutoencoder.from_pretrained(
    "BiliSakura/BitDance-Tokenizer-diffusers",  # or str(BASE_DIR) for local
    subfolder="ae_d16c32",
)

# ae_d16c32: z_channels=32, patch_size=16
# ae_d32c128: z_channels=128, patch_size=32
# ae_d32c256: z_channels=256, patch_size=32
```
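The subfolder names appear to encode the two values in the comments above, as `d<patch_size>c<z_channels>`. The sketch below parses a name and estimates the latent grid for a given image size. It assumes that `patch_size` is the spatial downsampling factor and that latents are laid out as `(channels, height, width)`; neither is confirmed by this repository, and `parse_ae_name` and `latent_shape` are hypothetical helpers.

```python
import re


def parse_ae_name(name):
    """Split a name like 'ae_d16c32' into (patch_size, z_channels)."""
    match = re.fullmatch(r"ae_d(\d+)c(\d+)", name)
    if match is None:
        raise ValueError(f"unrecognized autoencoder name: {name!r}")
    return int(match.group(1)), int(match.group(2))


def latent_shape(name, height, width):
    """Estimate the latent shape, assuming patch_size is the downsampling factor."""
    patch_size, z_channels = parse_ae_name(name)
    if height % patch_size or width % patch_size:
        raise ValueError("image size must be divisible by patch_size")
    return z_channels, height // patch_size, width // patch_size


print(latent_shape("ae_d16c32", 1024, 1024))  # (32, 64, 64)
```

Under the same assumptions, a 1024×1024 image gives a 32×32 grid for the two `d32` variants, with 128 or 256 channels per position.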
## Using with a BitDance pipeline (full inference)
To swap a tokenizer into a BitDance diffusers pipeline for image generation:
```python
import torch
from diffusers import DiffusionPipeline

# Load a BitDance diffusers pipeline first (provides BitDanceAutoencoder class).
pipe = DiffusionPipeline.from_pretrained(
    "BiliSakura/BitDance-14B-16x-diffusers",
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
).to("cuda")

# Swap in a tokenizer autoencoder from this repository.
pipe.autoencoder = pipe.autoencoder.__class__.from_pretrained(
    "BiliSakura/BitDance-Tokenizer-diffusers",
    subfolder="ae_d16c32",
).to("cuda")

image = pipe(
    prompt="A watercolor painting of a red fox in a snowy forest.",
    height=1024,
    width=1024,
).images[0]
image.save("bitdance_with_custom_tokenizer.png")
```
> Note: this repository stores tokenizer autoencoder components; use `trust_remote_code=True` with a BitDance runtime repo when loading custom classes.
## Citation
If you use this model, please cite BitDance and Diffusers:
```bibtex
@article{ai2026bitdance,
  title   = {BitDance: Scaling Autoregressive Generative Models with Binary Tokens},
  author  = {Ai, Yuang and Han, Jiaming and Zhuang, Shaobin and Hu, Xuefeng and Yang, Ziyan and Yang, Zhenheng and Huang, Huaibo and Yue, Xiangyu and Chen, Hao},
  journal = {arXiv preprint arXiv:2602.14041},
  year    = {2026}
}
@misc{von-platen-etal-2022-diffusers,
  author       = {Patrick von Platen and Suraj Patil and Anton Lozhkov and Pedro Cuenca and Nathan Lambert and Kashif Rasul and Mishig Davaadorj and Dhruv Nair and Sayak Paul and William Berman and Yiyi Xu and Steven Liu and Thomas Wolf},
  title        = {Diffusers: State-of-the-art diffusion models},
  year         = {2022},
  publisher    = {GitHub},
  journal      = {GitHub repository},
  howpublished = {\url{https://github.com/huggingface/diffusers}}
}
```
## License
This repository is distributed under the Apache-2.0 license, consistent with the upstream BitDance release.