piercus posted an update Oct 21
🚧 Reproducing LBM-Eraser… in progress! [1]

When repurposing a T2I model into a pure I2I model, there’s always that orphaned text path — what do we do with it? 🤔

You can reuse it as learnable embeddings in multi-task setups [2], freeze an empty text prompt, or distill / prune the corresponding part.
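
A minimal sketch of the "freeze an empty text prompt" option, assuming an SD 1.x-style pipeline with a CLIP text encoder (the repo id and variable names are illustrative, not taken from the LBM code):

```python
import torch
from transformers import CLIPTextModel, CLIPTokenizer

repo = "runwayml/stable-diffusion-v1-5"  # illustrative SD 1.x checkpoint id
tokenizer = CLIPTokenizer.from_pretrained(repo, subfolder="tokenizer")
text_encoder = CLIPTextModel.from_pretrained(repo, subfolder="text_encoder")
text_encoder.requires_grad_(False)  # the text path stays frozen, never trained

with torch.no_grad():
    tokens = tokenizer(
        "",  # empty prompt
        padding="max_length",
        max_length=tokenizer.model_max_length,
        return_tensors="pt",
    )
    # [1, 77, 768] tensor, computed once and reused as constant conditioning
    empty_prompt_embeds = text_encoder(tokens.input_ids)[0]
```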

In LBM, they take a clever route — zeroing [3] and reshaping [4] the text-related cross-attentions into self-attentions.
This gives you fresh weights for I2I computation, nicely integrated into your SD architecture.
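
To make the "cross-attention turned self-attention" idea concrete, here is a hedged toy sketch (single attention head, illustrative names, not the LBM implementation): once attn2's to_k / to_v weights have been adapted to shape [inner_dim, inner_dim], the former text cross-attention can simply be fed the image hidden states:

```python
import torch
import torch.nn as nn

inner_dim = 320
attn2_to_q = nn.Linear(inner_dim, inner_dim, bias=False)
attn2_to_k = nn.Linear(inner_dim, inner_dim, bias=False)  # was Linear(768, 320) in the T2I checkpoint
attn2_to_v = nn.Linear(inner_dim, inner_dim, bias=False)  # was Linear(768, 320) in the T2I checkpoint

hidden_states = torch.randn(1, 64 * 64, inner_dim)  # image tokens only, no text conditioning
q = attn2_to_q(hidden_states)
k = attn2_to_k(hidden_states)  # keys/values now come from the image, not the prompt
v = attn2_to_v(hidden_states)
attn = torch.softmax(q @ k.transpose(-2, -1) / inner_dim ** 0.5, dim=-1) @ v
print(attn.shape)  # torch.Size([1, 4096, 320])
```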

📎 References
[1] Our LBM Fork: https://github.com/finegrain-ai/LBM
[2] OmniPaint: Mastering Object-Oriented Editing via Disentangled Insertion-Removal Inpainting (arXiv:2503.08677)
[3] LBM Zeroing: https://github.com/gojasper/LBM/blob/cafebc46a9ac16dcc61691d289cc4676b5c75380/examples/training/train_lbm_surface.py#L147-L148
[4] LBM Reshaping: https://github.com/gojasper/LBM/blob/cafebc46a9ac16dcc61691d289cc4676b5c75380/examples/training/train_lbm_surface.py#L100

"zeroing and reshaping the text-related cross-attentions into self-attentions"

It's actually narrowing, not zeroing (even though strategy="zeros" is used in the StateDictAdapter()).

For instance, the logs show:

Adapting down_blocks.0.attentions.0.transformer_blocks.0.attn2.to_k.weight by narrowing from shape torch.Size([320, 768]) to torch.Size([320, 320])

So the extra weights are just discarded in this case. Zero-filling is only used when expanding tensors to larger shapes.

Corresponding code: link.
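
For readers who can't follow the link, here is a hypothetical re-implementation of the adapt step described above (a sketch, not the actual StateDictAdapter code): dimensions that shrink are narrowed, dimensions that grow are zero-filled, matching strategy="zeros".

```python
import torch

def adapt_tensor(src: torch.Tensor, target_shape: torch.Size) -> torch.Tensor:
    """Hypothetical adapt step: narrow dims that shrink, zero-fill dims that grow."""
    out = src
    # 1) narrow: slice away extra rows/columns on every dim that shrinks
    for dim, (s, t) in enumerate(zip(out.shape, target_shape)):
        if t < s:
            out = out.narrow(dim, 0, t)
    # 2) zero-fill: pad any dim that grows with zeros
    if out.shape != tuple(target_shape):
        padded = out.new_zeros(target_shape)
        padded[tuple(slice(0, d) for d in out.shape)] = out
        out = padded
    return out

# e.g. attn2.to_k.weight: [320, 768] -> [320, 320], extra columns discarded
print(adapt_tensor(torch.randn(320, 768), torch.Size([320, 320])).shape)
```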