---
language: en
tags:
- stable-diffusion
- diffusion
- text-to-image
license: other
---

# dee-unlearning-tiny-sd

**Model family:** Stable Diffusion | **Base:** SG161222/Realistic_Vision_V4.0 (Diffusers 0.19.0.dev0)

This repository packages the inference components (VAE, UNet, tokenizer, text encoder, and scheduler config) needed to instantiate a `StableDiffusionPipeline` for lightweight experimentation with deep unlearning ideas. All large binaries are tracked with Git LFS (`*.bin` and the other model artifact extensions listed in `.gitattributes`).

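Because the weights are tracked with Git LFS, a clone made without `git lfs pull` contains small pointer stubs instead of real binaries. The sketch below is one way to check a local clone before loading anything. It assumes the component folders use the usual Diffusers names (`vae`, `unet`, `tokenizer`, `text_encoder`, `scheduler`), which this card does not spell out, and the helper names are hypothetical. Calling `missing_components(".")` and `unfetched_lfs_files(".")` from the clone root should both return empty lists when the checkout is complete.

```python
from pathlib import Path

# First line of every Git LFS pointer file, as defined by the Git LFS pointer format.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"

# Assumption: folder names follow the usual Diffusers layout.
EXPECTED_COMPONENTS = ("vae", "unet", "tokenizer", "text_encoder", "scheduler")


def missing_components(repo_root):
    """Return the expected component folders that are absent from repo_root."""
    root = Path(repo_root)
    return [name for name in EXPECTED_COMPONENTS if not (root / name).is_dir()]


def unfetched_lfs_files(repo_root):
    """Return files that are still Git LFS pointers rather than real weights."""
    root = Path(repo_root)
    pointers = []
    for path in sorted(root.rglob("*")):
        # Skip directories and anything inside the .git metadata folder.
        if ".git" in path.parts or not path.is_file():
            continue
        with path.open("rb") as handle:
            head = handle.read(len(LFS_POINTER_PREFIX))
        if head == LFS_POINTER_PREFIX:
            pointers.append(path.relative_to(root).as_posix())
    return pointers
```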
---

## Model summary

- **Architecture:** `StableDiffusionPipeline` with `UNet2DConditionModel`, `CLIPTextModel`, `AutoencoderKL`, and `DPMSolverMultistepScheduler`.
- **Scheduler:** DPM-Solver++ (multistep), configured with `num_train_timesteps=1000`, `steps_offset=1`, and the default `epsilon` prediction type, which matches the diffusion formulation used in Realistic Vision.
- **Intended behavior:** Generate photorealistic samples guided by text prompts. The “tiny” name reflects a focus on a compact deployment bundle rather than a new generative architecture.

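The scheduler settings listed above can be compared against the packaged config with a few lines of standard-library Python. This is a minimal sketch: it assumes the config lives at `scheduler/scheduler_config.json` (the usual Diffusers location) and that the file stores each key explicitly. A key that is absent from the file is reported as `None`, even where the library would fall back to a matching default. The function names are hypothetical.

```python
import json
from pathlib import Path

# Values documented in the model summary.
DOCUMENTED_SCHEDULER_SETTINGS = {
    "_class_name": "DPMSolverMultistepScheduler",
    "num_train_timesteps": 1000,
    "steps_offset": 1,
    "prediction_type": "epsilon",
}


def scheduler_mismatches(config):
    """Map each documented key to (expected, found) where the two differ."""
    return {
        key: (expected, config.get(key))
        for key, expected in DOCUMENTED_SCHEDULER_SETTINGS.items()
        if config.get(key) != expected
    }


def check_scheduler_file(path="scheduler/scheduler_config.json"):
    """Load a scheduler config from disk and report mismatches (path is an assumption)."""
    config = json.loads(Path(path).read_text(encoding="utf-8"))
    return scheduler_mismatches(config)
```

An empty dictionary from `check_scheduler_file()` means the file agrees with the summary; any other result lists the keys to inspect.
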
## Usage

1. Install dependencies (tested with `diffusers==0.19.0.dev0`, `transformers`, `torch`, `accelerate`, `safetensors`).
2. Load the pipeline with the provided components.

```python
import torch
from diffusers import (
    AutoencoderKL,
    DPMSolverMultistepScheduler,
    StableDiffusionPipeline,
    UNet2DConditionModel,
)
from transformers import CLIPTextModel, CLIPTokenizer

# Assemble the pipeline from the individual component folders.
pipeline = StableDiffusionPipeline(
    text_encoder=CLIPTextModel.from_pretrained("path/to/text_encoder"),
    tokenizer=CLIPTokenizer.from_pretrained("path/to/tokenizer"),
    unet=UNet2DConditionModel.from_pretrained("path/to/unet"),
    vae=AutoencoderKL.from_pretrained("path/to/vae"),
    scheduler=DPMSolverMultistepScheduler.from_pretrained("path/to/scheduler"),
    # This repository ships no safety checker, so disable it explicitly.
    safety_checker=None,
    feature_extractor=None,
    requires_safety_checker=False,
)
pipeline.to("cuda")

prompt = "A cinematic portrait of a futuristic astronaut exploring a coral reef"
with torch.autocast("cuda"):
    image = pipeline(prompt, num_inference_steps=25, guidance_scale=7.5).images[0]
```

Replace each `path/to/...` placeholder with the component's relative path inside this repository (e.g., `"text_encoder"`). Exported weights follow the standard Diffusers layout, so you can also load the entire pipeline from disk with `StableDiffusionPipeline.from_pretrained(...)`, pointed at the repository root, if you prefer a single root.

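Loading from a single root depends on the `model_index.json` file that the standard Diffusers layout keeps at the top level of a saved pipeline. The sketch below lists the components that file declares and wraps the one-call load. It assumes this repository includes `model_index.json`, which the card implies but does not state, and both function names are hypothetical. `diffusers` is imported lazily so the listing helper works without it installed.

```python
import json
from pathlib import Path


def list_components(repo_root):
    """Return {component: (library, class_name)} as declared in model_index.json."""
    index_path = Path(repo_root) / "model_index.json"
    index = json.loads(index_path.read_text(encoding="utf-8"))
    components = {}
    for name, value in index.items():
        # Metadata keys start with "_"; flags such as requires_safety_checker are not lists.
        if name.startswith("_") or not isinstance(value, list) or len(value) != 2:
            continue
        library, class_name = value
        if library is None or class_name is None:
            continue  # for example, a disabled safety checker
        components[name] = (library, class_name)
    return components


def load_pipeline(repo_root):
    """Load the whole pipeline from one root directory."""
    from diffusers import StableDiffusionPipeline  # imported only when needed

    return StableDiffusionPipeline.from_pretrained(repo_root)
```

Running `list_components(".")` first is a cheap way to see which subfolders `load_pipeline(".")` will try to read.
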
## Known limitations

- Not evaluated on any public benchmark; quality, bias, and safety metrics are unknown beyond the original Realistic Vision baseline.
- Outputs inherit the biases of the base model's training data, which can include underrepresentation of marginalized groups, and the model may hallucinate architecture or people.
- Prompts that contradict physics, are highly abstract, or request disallowed content may fail or produce unpredictable imagery.
- Fine-tuning beyond the provided weights may require additional safety mitigations, depending on your dataset.

## Opportunities

1. **Research experimentation:** Use this compact bundle to investigate targeted unlearning strategies or dataset pruning without re-downloading massive checkpoints.
2. **Edge deployment:** Swap in a different scheduler or reduce `num_inference_steps` to explore speed/quality trade-offs for on-device sampling.
3. **Controlled generation:** Attach additional conditioning (CLIP embeddings, ControlNet) to the pipeline for downstream applications such as assistive art tools, conditional rendering, or creative assistants.

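The speed/quality exploration in the edge deployment item can be sketched as a small timing sweep. This is an illustrative helper, not part of the repository: `sweep_inference_steps` is a hypothetical name, the step counts are arbitrary, and `generate` is any callable that accepts a prompt and a `num_inference_steps` keyword, which a loaded `StableDiffusionPipeline` does.

```python
import time


def sweep_inference_steps(generate, prompt, step_counts=(10, 15, 25, 40)):
    """Time generate(prompt, num_inference_steps=n) for each n in step_counts."""
    results = []
    for steps in step_counts:
        start = time.perf_counter()
        output = generate(prompt, num_inference_steps=steps)
        elapsed = time.perf_counter() - start
        # Keep the output so image quality can be compared next to the timing.
        results.append({"steps": steps, "seconds": elapsed, "output": output})
    return results
```

With a loaded pipeline, `sweep_inference_steps(pipeline, prompt)` returns one record per step count, and each image is available as `record["output"].images[0]`. The first run usually includes warm-up cost, so treat its timing with caution.
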
## Safety considerations

- Follow established safety best practices when generating faces, political imagery, or NSFW prompts; the pipeline does not include a safety checker.
- Monitor outputs for deceptive or fabricated content before deployment in public-facing products.
- Don’t use the model to impersonate real people, create harmful memes, or automate disinformation campaigns.

## Attribution & licensing

This work builds on the `SG161222/Realistic_Vision_V4.0` checkpoints and the Diffusers ecosystem. Verify and comply with the upstream license before redistributing or fine-tuning the weights.