July 11, 2023

ARTICLE NINE: MY TEXT-TO-IMG JOURNEY - COMFYUI AND SDXL 0.9

This is the ninth post in a series of articles I have been writing about text-to-image software. In this series, I will talk about how the technology works, ways you can use it, how to set up your own system to batch out images, technical advice on hardware and software, advice on how you can get the images you want, and touch on some of the legal, ethical, and cultural challenges that we are seeing around this technology. My goal is to keep it practical so that anyone with a little basic computer knowledge can understand and use these articles to add another tool to their art toolbox, as cheaply and practically as possible. In this ninth post, we will discuss another User Interface (UI) for text-to-image models, called ComfyUI, and use the newest Stable Diffusion model, SDXL 0.9, in ComfyUI.

More About SDXL

Stability AI has released SDXL 0.9 for limited testing. At the time of this writing, you can request access to download the models. We will not be digging heavily into the new SDXL, since the full 1.0 version will be out soon and I am sure there will be many articles about the changes and improvements. If you want to read more details, see: https://stability.ai/blog/sdxl-09-stable-diffusion

To download the SDXL files, go to Hugging Face and register to access all the SDXL 0.9 files: https://huggingface.co/stabilityai

ComfyUI

ComfyUI is currently one of the open source projects that support Stable Diffusion models. At this time, Automatic1111 does not support SDXL models. You could look at some other UIs that do, like VLAD (a UI based on Automatic1111), but I thought this was an excellent opportunity to test out ComfyUI because of its focus on workflow. It also never hurts to have a few options when it comes to open source software, since you can never be sure when a program will become another unmaintained project.

The ComfyUI interface lets you design and execute advanced Stable Diffusion pipelines using a graph/nodes/flowchart based interface. This comes with a bit of a learning curve, but it also provides an excellent interface to automate workflows for SDXL, 2.1, and 1.5 SD models. It gets better. You don’t need to start from scratch! There are several workflows shared on sites, including CIVITAI. We shared one of them for this tutorial. Go ahead, take a look, and download our workflow: https://civitai.com/models/107451/8buff-node-for-comfyui-sdxl-09-testing

Install ComfyUI

Installing ComfyUI is relatively straightforward and is documented on the GitHub site: https://github.com/comfyanonymous/ComfyUI

From Link Above:

“Windows

There is a portable standalone build for Windows that should work for running on Nvidia GPUs or for running on your CPU only on the releases page.

Direct link to download

Simply download, extract with 7-Zip and run. Make sure you put your Stable Diffusion checkpoints/models (the huge ckpt/safetensors files) in: ComfyUI\models\checkpoints”

You can skip down to “Running ComfyUI” if you install the portable standalone build above. Continue reading if you have Git and Python installed on your Windows system and want to do a manual install.

There are options to install manually on Windows, Linux, and Mac, and to run on Colab or Paperspace. If you have another UI installed, like Automatic1111, you can share your models between the two by adjusting a config file that sets the search paths for models. ComfyUI ships an example of this file, extra_model_paths.yaml.example, and in the standalone Windows build you can find it in the ComfyUI directory. Rename this file to extra_model_paths.yaml and edit it with your favorite text editor.
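As a rough illustration of what the edited extra_model_paths.yaml might contain, here is a small Python sketch that builds the text for an Automatic1111 install. The section name (a111), the key names, and the folder names are assumptions based on our reading of the example file, and the install path is a made-up placeholder. Compare the output against the example file that ships with your copy of ComfyUI before relying on it.

```python
def build_extra_model_paths(a1111_root: str) -> str:
    """Return YAML text that points ComfyUI at an existing Automatic1111 install.

    The section and key names below are assumptions; check them against the
    example config file that ships with ComfyUI.
    """
    lines = [
        "a111:",
        f"    base_path: {a1111_root}",
        "    checkpoints: models/Stable-diffusion",
        "    vae: models/VAE",
        "    loras: models/Lora",
        "    embeddings: embeddings",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical install location; replace with your own folder.
    print(build_extra_model_paths("C:/stable-diffusion-webui/"))
```

Printing the text first, instead of writing the file directly, lets you review it before you paste it into extra_model_paths.yaml.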

We did a manual install on Windows, since we already have Git and Python installed, and have provided detailed steps below.

Run the git command to pull the ComfyUI files:

git clone https://github.com/comfyanonymous/ComfyUI.git

Change into the new ComfyUI folder and run the following command to install the dependencies:

pip install -r requirements.txt

Nvidia users should install torch and xformers using this command:

pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu118 xformers
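If you want to confirm that the torch install can actually see your Nvidia card before you start ComfyUI, a short check like the one below may help. This is a minimal sketch and not part of ComfyUI. The function name describe_torch is our own, and it only uses standard torch calls.

```python
import importlib.util


def describe_torch() -> str:
    """Report whether torch is installed and whether it can see a CUDA GPU."""
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed in this Python environment"

    import torch  # imported here so the script still runs without torch

    if torch.cuda.is_available():
        name = torch.cuda.get_device_name(0)
        return f"torch {torch.__version__} sees GPU: {name}"
    return f"torch {torch.__version__} is installed but no CUDA GPU is visible"


if __name__ == "__main__":
    print(describe_torch())
```

If the script reports that no CUDA GPU is visible, ComfyUI will most likely fall back to the CPU, which is much slower.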

Running ComfyUI

Before running ComfyUI, we need to download the safetensors files and put them in place.

Put your SD checkpoints, the huge 13GB safetensors files from https://huggingface.co/stabilityai, into the folder: ComfyUI/models/checkpoints

You will need the following models:

sd_xl_base_0.9.safetensors

sd_xl_refiner_0.9.safetensors

We also recommend pulling the VAE file and placing it into: ComfyUI/models/vae
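Since a misplaced or misnamed file is an easy mistake to make with downloads this large, here is a small sketch that checks whether the two checkpoints are where ComfyUI expects them. The function name and the "ComfyUI" root folder are our own assumptions. Adjust the file names if your downloads are named differently.

```python
from pathlib import Path

# Paths relative to the ComfyUI folder, as described in this article.
EXPECTED_FILES = [
    "models/checkpoints/sd_xl_base_0.9.safetensors",
    "models/checkpoints/sd_xl_refiner_0.9.safetensors",
]


def missing_models(comfy_root: str) -> list[str]:
    """Return the expected model files that are not present under comfy_root."""
    root = Path(comfy_root)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


if __name__ == "__main__":
    # Assumes the script is run from the folder that contains ComfyUI.
    missing = missing_models("ComfyUI")
    if missing:
        for rel in missing:
            print("missing:", rel)
    else:
        print("both SDXL 0.9 checkpoints found")
```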

Starting Up

To start ComfyUI after installing the Standalone version, double click on the following BAT file:

run_nvidia_gpu.bat

This should open a browser window for you.

If you did the manual install, run the following from the command line inside the ComfyUI folder:

python main.py

Either way, if you enter http://127.0.0.1:8188 in your browser, you should see the ComfyUI node graph.
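If the browser page does not load, it can help to check whether anything is answering on that address at all. The sketch below does this with only the Python standard library. The function name server_is_up is our own, and the check only tells you that a web server responded on that port, not that it is specifically ComfyUI.

```python
import http.client
import urllib.request


def server_is_up(url: str = "http://127.0.0.1:8188", timeout: float = 3.0) -> bool:
    """Return True if an HTTP server answers at url with status 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, HTTP error, or a malformed reply.
        return False


if __name__ == "__main__":
    if server_is_up():
        print("Something is answering on http://127.0.0.1:8188")
    else:
        print("Nothing answered; check that ComfyUI is still running")
```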

Now is the time to load the JSON file for the workflow we used for testing. Download the file from https://civitai.com/models/107451/8buff-node-for-comfyui-sdxl-09-testing and unzip it. That should provide a file called “8Buff_COMFYUI_ workflow_SDXL0.9.json” that is preconfigured for SDXL files and prompts.

Drag the JSON file onto your ComfyUI browser window. The workflow should load into the graph.
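If you are curious about what a workflow file contains before you load it, you can peek inside with a few lines of Python. This sketch assumes the file stores its nodes in a top-level "nodes" list where each entry has a "type" field, which matches the workflow files we have looked at but is not guaranteed. The sample data in the usage section is made up for illustration.

```python
import json
from collections import Counter


def load_workflow(path: str) -> dict:
    """Read a workflow JSON file from disk."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def count_node_types(workflow: dict) -> Counter:
    """Count how many nodes of each type a workflow contains.

    Assumes a top-level "nodes" list whose entries carry a "type" key.
    """
    return Counter(node.get("type", "unknown") for node in workflow.get("nodes", []))


if __name__ == "__main__":
    # Made-up sample; use load_workflow("your_file.json") for a real file.
    sample = {"nodes": [{"type": "KSampler"}, {"type": "KSampler"}, {"type": "SaveImage"}]}
    for node_type, count in count_node_types(sample).most_common():
        print(node_type, count)
```

A quick count like this shows whether a shared workflow needs custom nodes that you have not installed yet.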

You can go ahead and click “Queue Prompt” on the right side of your screen.

The first time you run the workflow, ComfyUI will read and verify both of the large SDXL safetensors files. This will take some time, almost 450 seconds for me the first time. Be patient.

After that first run, most images are generated in under 30 seconds using the defaults.

The “8Buff_COMFYUI_ workflow_SDXL0.9.json” file provides the basic settings you need to test different models, feed them through the refiner, and output both the image from the Base model alone and the image the Refiner processed. This gives you a good starting point to test ComfyUI and maybe download other nodes and workflows in the future.

SDXL 0.9 Observations

The new model is an improvement in many ways, in our opinion. Some quick observations about the positives:

Generating 1080x1080 images through the Base AND the Refiner takes about the same time as 1.5 or 2.1 take for 768x768 images. It seems much faster.
The refiner can make huge improvements to the final image quality, including fixing the hand and eye issues common in previous models.
Color, lighting, and detail seem to be overall improved when going for realistic images.
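To put the speed observation above in perspective, here is the pixel arithmetic behind it as a quick sketch. It only compares image sizes and says nothing about how each model spends its time.

```python
def pixel_ratio(new_side: int, old_side: int) -> float:
    """Return how many times more pixels a square image of new_side has than old_side."""
    return (new_side * new_side) / (old_side * old_side)


if __name__ == "__main__":
    ratio = pixel_ratio(1080, 768)
    # 1080 * 1080 = 1,166,400 pixels and 768 * 768 = 589,824 pixels
    print(f"1080x1080 has {ratio:.2f}x the pixels of 768x768")
```

In other words, SDXL is producing roughly twice as many pixels in about the same time.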

That said, we have some concerns:

These files are huge. If the process remains the same, training will also require large images, because to get consistent results you should generate images at the same size the model was trained on. This will require more testing.
The new SDXL model seems to demand a workflow with a refiner for best results. Some initial testing with other 1.5 and 2.1 models showed that the refiner was not backward compatible. Again, this will need more testing.
There were times when we liked the Base image more, and the refiner introduced problems.

Overall, the new SDXL model is very promising. Using SDXL with ComfyUI made us look at other ways we can test and improve our workflows, and start to share and test others' workflows. You can check out custom ComfyUI node workflows on Civitai:
https://civitai.com/tag/custom%20node

To summarize, the SDXL model is an improvement, but we don’t think it is anywhere close to reaching its peak yet. Just as when the first versions of 1.5 and 2.1 came out, the output has room for improvement. The SDXL version is obviously steps ahead, but I think we will have to wait a couple of months to see its true potential. That potential seems to come out in the communities that train, adjust weights, and mix models to achieve what we are all looking for: creative expression at scale.

Thank all the people who manage these open source projects that make your work a little easier.

You can subscribe to our new YouTube Channel as we continue to explore video creation in the future!
https://www.youtube.com/channel/UCvWVColnywSh2WtviATTmCA

If you run into issues, we are happy to help our Patreon supporters with advice and tips: https://patreon.com/EightBuffaloMediaGroup

Eight Buffalo Media Group Blog