Latent-DIffusion-Transformer

This is a from-scratch implementation of a Latent Diffusion Model.

Usage Guide

pip install -r requirements.txt

Download vocab.json and merges.txt from https://huggingface.co/stable-diffusion-v1-5/stable-diffusion-v1-5/tree/main and save them in the data folder in the root directory.
Similarly, download the pretrained weights v1-5-pruned-emaonly.ckpt from the same link and save it in the data folder.
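Before running the model, it can help to confirm that the three downloaded files are where the code expects them. The following is a minimal sketch, assuming the folder is named data and sits in the directory you run the script from. The helper name missing_files is hypothetical and not part of this repository.

```python
from pathlib import Path

# File names taken from the download instructions above.
REQUIRED_FILES = ("vocab.json", "merges.txt", "v1-5-pruned-emaonly.ckpt")


def missing_files(data_dir="data"):
    """Return the required file names that are not present in data_dir."""
    data_dir = Path(data_dir)
    return [name for name in REQUIRED_FILES if not (data_dir / name).is_file()]


if __name__ == "__main__":
    missing = missing_files()
    if missing:
        print("Missing from the data folder:", ", ".join(missing))
    else:
        print("All required files found.")
```

An empty list means all three files are in place.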

Modify the prompt variable to specify the image you want to generate.
(Optional) Use uncond_prompt to exclude certain elements from the generated image.
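An unconditional (negative) prompt usually takes effect through classifier-free guidance: the model predicts noise once for prompt and once for uncond_prompt, and the two predictions are combined so the result moves away from the unconditional one. The sketch below illustrates that combination on plain Python lists. The function name guided_noise and the parameter cfg_scale are assumptions made for illustration, and this repository's pipeline may implement the step differently.

```python
def guided_noise(cond, uncond, cfg_scale=7.5):
    """Combine conditional and unconditional noise predictions.

    Standard classifier-free guidance: uncond + cfg_scale * (cond - uncond).
    cond and uncond are equal-length sequences of floats standing in for
    the model's two noise predictions (hypothetical stand-ins for tensors).
    """
    return [u + cfg_scale * (c - u) for c, u in zip(cond, uncond)]


# With cfg_scale = 1 the result equals the conditional prediction;
# larger values push the output further from the unconditional one.
print(guided_noise([1.0, 2.0], [0.0, 2.0], cfg_scale=2.0))
```

Where the two predictions agree, the output is unchanged. Where they differ, the difference is amplified, which is how content described by uncond_prompt gets suppressed.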

Create an images folder in the root directory.
Add your desired image to this folder.
Set the image_path variable to the relative path of the image.
Uncomment the following line:
input_image = Image.open(image_path)
Optimize Performance
If your system supports CUDA (for NVIDIA GPUs) or MPS (for Apple Silicon), enable faster processing by setting the corresponding constant to True:
ALLOW_CUDA = True # Enable CUDA for NVIDIA GPUs
OR
ALLOW_MPS = True # Enable MPS for Apple Silicon (M1/M2)
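The two flags typically feed a device-selection step like the one sketched here. The helper pick_device is hypothetical and the fallback order (CUDA, then MPS, then CPU) is an assumption. The torch calls used, torch.cuda.is_available() and torch.backends.mps.is_available(), are standard PyTorch, and the import is guarded so the snippet still runs when PyTorch is not installed.

```python
def pick_device(allow_cuda, allow_mps, cuda_available, mps_available):
    """Return "cuda", "mps", or "cpu" based on the flags and what is available."""
    if allow_cuda and cuda_available:
        return "cuda"
    if allow_mps and mps_available:
        return "mps"
    return "cpu"


try:
    import torch

    cuda_ok = torch.cuda.is_available()
    # Older PyTorch builds have no MPS backend, so look it up defensively.
    mps_backend = getattr(torch.backends, "mps", None)
    mps_ok = mps_backend is not None and mps_backend.is_available()
except ImportError:
    # Without PyTorch, no accelerator can be detected; fall back to CPU.
    cuda_ok = mps_ok = False

ALLOW_CUDA = True   # Enable CUDA for NVIDIA GPUs
ALLOW_MPS = False   # Enable MPS for Apple Silicon

DEVICE = pick_device(ALLOW_CUDA, ALLOW_MPS, cuda_ok, mps_ok)
print(f"Using device: {DEVICE}")
```

Setting a flag to True on a machine without the matching hardware falls back to the CPU in this sketch.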

Prompt: A very high quality image of a scenic view containing mountains and a river flowing between them, 8k resolution.