DemoFusion

Code release for "DemoFusion: Democratising High-Resolution Image Generation With No 💰" (arXiv 2023)

Abstract: High-resolution image generation with Generative Artificial Intelligence (GenAI) has immense potential but, due to the enormous capital investment required for training, it is increasingly centralised to a few large corporations, and hidden behind paywalls. This paper aims to democratise high-resolution GenAI by advancing the frontier of high-resolution generation while remaining accessible to a broad audience. We demonstrate that existing Latent Diffusion Models (LDMs) possess untapped potential for higher-resolution image generation. Our novel DemoFusion framework seamlessly extends open-source GenAI models, employing Progressive Upscaling, Skip Residual, and Dilated Sampling mechanisms to achieve higher-resolution image generation. The progressive nature of DemoFusion requires more passes, but the intermediate results can serve as "previews", facilitating rapid prompt iteration.

News

2023.12.08: 🚀 A HuggingFace Demo for Img2Img is now available! Thanks to Radamés for the implementation and the support!
2023.12.07: 🚀 Added a Colab demo. Check it out! Thanks to camenduru for the implementation!
2023.12.06: 🚀 A local Gradio demo is now available, with better interaction and presentation!
2023.12.04: ✨ A low-VRAM version of DemoFusion is available! Thanks to klimaleksus for the implementation!
2023.12.01: 🚀 Integrated into Replicate. Check out the online demo! Thanks to Luis C. for the implementation!
2023.11.29: 💰 pipeline_demofusion_sdxl is released.

Usage

A quick try with integrated demos

HuggingFace Space: try Text2Image generation and Image2Image enhancement.
Colab: try Text2Image generation and Image2Image enhancement.
Replicate: try Text2Image generation and Image2Image enhancement.
⚠️ For Image2Image enhancement, please note that DemoFusion's capability is strongly correlated with SDXL's training data distribution, so results will show a significant bias. Have fun, and regard this "enhancement" as a side application of text+image based generation.

Starting with our code

Running the default setting in the paper (will take about 17 GB of VRAM)

Set up the dependencies as:

conda create -n demofusion python=3.9
conda activate demofusion
pip install -r requirements.txt

Download pipeline_demofusion_sdxl.py and run it as follows. A use case can be found in demo.ipynb.

import torch  # needed for torch.float16 below
from pipeline_demofusion_sdxl import DemoFusionSDXLPipeline

model_ckpt = "stabilityai/stable-diffusion-xl-base-1.0"
pipe = DemoFusionSDXLPipeline.from_pretrained(model_ckpt, torch_dtype=torch.float16)
pipe = pipe.to("cuda")

prompt = "Envision a portrait of an elderly woman, her face a canvas of time, framed by a headscarf with muted tones of rust and cream. Her eyes, blue like faded denim. Her attire, simple yet dignified."
negative_prompt = "blurry, ugly, duplicate, poorly drawn, deformed, mosaic"

images = pipe(prompt, negative_prompt=negative_prompt,
              height=3072, width=3072, view_batch_size=16, stride=64,
              num_inference_steps=50, guidance_scale=7.5,
              cosine_scale_1=3, cosine_scale_2=1, cosine_scale_3=1, sigma=0.8,
              multi_decoder=True, show_image=True
             )

for i, image in enumerate(images):
    image.save('image_' + str(i) + '.png')
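The loop above saves every image the pipeline returns, not just the final one. As a rough sketch of what to expect, the helper below lists the side length of each progressive phase for a square target. It assumes that phases grow in whole multiples of SDXL's 1024-pixel base resolution and that one image is returned per phase; `preview_sizes` and `preview_filenames` are hypothetical names, not part of the released pipeline.

```python
def preview_sizes(target, base=1024):
    """List the square side length of each progressive phase, smallest first.

    Assumption: phases grow in whole multiples of `base` (1024 px for SDXL)
    until `target` is reached. Non-square or non-multiple sizes are not handled.
    """
    if target < base or target % base != 0:
        raise ValueError("this sketch only handles whole multiples of the base size")
    return [base * k for k in range(1, target // base + 1)]


def preview_filenames(target, base=1024):
    """Map the 'image_<i>.png' names used above to the assumed side length."""
    return {f"image_{i}.png": side
            for i, side in enumerate(preview_sizes(target, base))}


if __name__ == "__main__":
    for name, side in preview_filenames(3072).items():
        print(f"{name}: {side}x{side}")
```

Under these assumptions, the earlier files act as the "previews" mentioned in the abstract, and the last file is the full-resolution result.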

⚠️ When you have enough VRAM (e.g., generating 2048×2048 images on hardware with more than 18 GB of VRAM), you can set multi_decoder=False, which makes the decoding process faster.
Please feel free to try different prompts and resolutions.
Default hyper-parameters are recommended, but they may not be optimal for all cases. For specific impacts of each hyper-parameter, please refer to Appendix C in the DemoFusion paper.
The code was cleaned before the release. If you encounter any issues, please contact us.
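To build some intuition for the cosine_scale_* arguments before reading Appendix C, here is a small sketch. It assumes that each scale acts as an exponent on a cosine weight that decays from 1 at the first (noisiest) timestep to 0 at the last one. That reading is an assumption based on the parameter names; the authoritative definition is in the paper and in pipeline_demofusion_sdxl.py. The function name and the 1000-timestep default are hypothetical.

```python
import math


def cosine_weight(t, num_train_timesteps=1000, scale=3.0):
    """Assumed cosine-decay weight raised to the power `scale`.

    t                    -- current timestep (num_train_timesteps = noisiest, 0 = clean)
    num_train_timesteps  -- assumed scheduler length (hypothetical default)
    scale                -- stands in for cosine_scale_1 / _2 / _3
    """
    factor = 0.5 * (1.0 + math.cos(math.pi * (num_train_timesteps - t) / num_train_timesteps))
    return factor ** scale


if __name__ == "__main__":
    # Compare how quickly the weight fades for scale=1 and scale=3.
    for t in (1000, 750, 500, 250, 0):
        print(t, round(cosine_weight(t, scale=1.0), 4), round(cosine_weight(t, scale=3.0), 4))
```

Under this assumption, a larger scale makes the weight fall off sooner, so the corresponding mechanism influences fewer of the later denoising steps.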

Running on Windows with 8 GB of VRAM

Set up the environment as:

git clone "https://github.com/PRIS-CV/DemoFusion"
cd DemoFusion
python -m venv venv
venv\Scripts\activate
pip install -U "xformers==0.0.22.post7+cu118" --index-url https://download.pytorch.org/whl/cu118
pip install "diffusers==0.21.4" "matplotlib==3.8.2" "transformers==4.35.2" "accelerate==0.25.0"
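Because this setup pins exact versions, it can help to confirm what actually got installed before launching. The following is a minimal sketch using only the standard library; the `PINNED` table is copied from the pip command above, and `check_pins` is a hypothetical helper name. xformers is left out because its installed version string may carry a local build suffix.

```python
from importlib import metadata

# Pins copied from the pip command above.
PINNED = {
    "diffusers": "0.21.4",
    "matplotlib": "3.8.2",
    "transformers": "4.35.2",
    "accelerate": "0.25.0",
}


def check_pins(pins):
    """Return {package: (expected, installed_or_None)} for every mismatch."""
    mismatches = {}
    for name, expected in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        if installed != expected:
            mismatches[name] = (expected, installed)
    return mismatches


if __name__ == "__main__":
    problems = check_pins(PINNED)
    for name, (expected, installed) in problems.items():
        print(f"{name}: expected {expected}, found {installed or 'nothing'}")
    if not problems:
        print("All pinned packages match.")
```

Run it inside the activated venv so that it inspects the same environment the pipeline will use.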

Launch DemoFusion as follows. The use case can be found in demo_lowvram.py.

from pipeline_demofusion_sdxl import DemoFusionSDXLPipeline

import torch
from diffusers.models import AutoencoderKL
vae = AutoencoderKL.from_pretrained("madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16)

model_ckpt = "stabilityai/stable-diffusion-xl-base-1.0"
pipe = DemoFusionSDXLPipeline.from_pretrained(model_ckpt, torch_dtype=torch.float16, vae=vae)
pipe = pipe.to("cuda")

prompt = "Envision a portrait of an elderly woman, her face a canvas of time, framed by a headscarf with muted tones of rust and cream. Her eyes, blue like faded denim. Her attire, simple yet dignified."
negative_prompt = "blurry, ugly, duplicate, poorly drawn, deformed, mosaic"

images = pipe(prompt, negative_prompt=negative_prompt,
              height=2048, width=2048, view_batch_size=4, stride=64,
              num_inference_steps=40, guidance_scale=7.5,
              cosine_scale_1=3, cosine_scale_2=1, cosine_scale_3=1, sigma=0.8,
              multi_decoder=True, show_image=False, lowvram=True
             )

for i, image in enumerate(images):
    image.save('image_' + str(i) + '.png')
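The two pipe(...) calls in this README differ in only a handful of arguments. The sketch below collects those differences in two dictionaries and picks one from the available VRAM. The thresholds come from the section headings ("about 17 GB" and "8 GB") and are approximate; `pick_preset` and the preset names are hypothetical helpers, not part of the released code.

```python
# Arguments shared by both calls in this README.
SHARED = dict(stride=64, guidance_scale=7.5,
              cosine_scale_1=3, cosine_scale_2=1, cosine_scale_3=1,
              sigma=0.8, multi_decoder=True)

# Arguments that differ between the default and the low-VRAM call.
DEFAULT_PRESET = dict(SHARED, height=3072, width=3072, view_batch_size=16,
                      num_inference_steps=50, show_image=True)
LOWVRAM_PRESET = dict(SHARED, height=2048, width=2048, view_batch_size=4,
                      num_inference_steps=40, show_image=False, lowvram=True)


def pick_preset(vram_gb):
    """Choose a preset from the VRAM figures quoted in the headings (approximate)."""
    if vram_gb >= 17:
        return DEFAULT_PRESET
    if vram_gb >= 8:
        return LOWVRAM_PRESET
    raise ValueError("less VRAM than either configuration in this README assumes")


if __name__ == "__main__":
    preset = pick_preset(8)
    print(preset)
    # With a loaded pipeline this would be used as:
    # images = pipe(prompt, negative_prompt=negative_prompt, **preset)
```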

Running with Gradio demo

Make sure you have installed gradio and gradio_imageslider.
Then launch the demo with python gradio_demo.py for better interaction and presentation.
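A quick way to confirm that both modules are importable before starting the demo is sketched below. It uses only the standard library; `missing_modules` is a hypothetical helper name.

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(["gradio", "gradio_imageslider"])
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("Gradio dependencies found; run: python gradio_demo.py")
```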

Citation

If you find this paper useful in your research, please consider citing:

@article{du2023demofusion,
  title={DemoFusion: Democratising High-Resolution Image Generation With No \$\$\$},
  author={Du, Ruoyi and Chang, Dongliang and Hospedales, Timothy and Song, Yi-Zhe and Ma, Zhanyu},
  journal={arXiv preprint arXiv:2311.16973},
  year={2023}
}
