IT-Blender

Imagine for Me: Creative Conceptual Blending of Real Images and Text via Blended Attention
Wonwoong Cho1,
Yanxia Zhang2,
Yan-Ying Chen2,
David Inouye1
1Elmore Family School of Electrical and Computer Engineering, Purdue University
2Toyota Research Institute
Features
IT-Blender is a T2I diffusion adapter that can automate the blending process of visual and textual concepts to enhance human creativity.
- Preserving detailed visual concepts from a reference image: We leverage the denoising network (both UNet-based and DiT-based) as an image encoder to maintain the details of visual concepts.
- Disentangling textual and visual concepts: We design a novel Blended Attention on top of the image self-attention module, where textual concepts are physically separated, encouraging disentanglement of textual and visual concepts.
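To make the second feature concrete, here is a minimal, dependency-free sketch of one way such an attention step could look. This card does not give the exact formulation, so everything here is an assumption for illustration: the function names `attention` and `blended_attention` are hypothetical, learned query/key/value projections are omitted, and the "blending" is modeled simply as noisy-image queries attending over the concatenation of noisy-image and reference-image tokens. Text tokens never enter this function, which is the sense in which the textual stream is kept separate.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Plain scaled dot-product attention over lists of equal-length vectors."""
    dim = len(queries[0])
    outputs = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in keys]
        weights = softmax(scores)
        outputs.append(
            [sum(w * v[j] for w, v in zip(weights, values)) for j in range(len(values[0]))]
        )
    return outputs


def blended_attention(noisy_tokens, reference_tokens):
    """Hypothetical sketch, not the released implementation.

    Queries come from the noisy image tokens only; keys and values are the
    noisy tokens followed by the reference-image tokens. No text tokens are
    involved at this step.
    """
    keys_values = noisy_tokens + reference_tokens
    return attention(noisy_tokens, keys_values, keys_values)


if __name__ == "__main__":
    noisy = [[1.0, 0.0]]
    reference = [[0.0, 1.0]]
    print(blended_attention(noisy, reference))
```

In this toy run the single output vector is a weighted mix of the noisy token and the reference token, with more weight on the token that best matches the query. The real adapter operates on high-dimensional features inside the UNet or DiT blocks and may weight or project the two streams differently.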
Pretrained Models
Model | Base model | Description | Resolution |
---|---|---|---|
IT-Blender FLUX | FLUX.1-dev | The model used in the paper. 1.43 GB. | (512, 512) |
IT-Blender StableDiffusion | SD 1.5 | The model used in the paper. 99.1 MB. | (512, 512) |
License
This project is licensed by Purdue University.
See the LICENSE file for full license terms.