ARTICLE EIGHT: MY TEXT-TO-IMG JOURNEY - AUTOMATIC1111 EXTENSION INFINITE ZOOM
This is the eighth post in a series of articles I have been writing about text-to-image software. In this series, I will talk about how the technology works, ways you can use it, how to set up your own system to batch out images, technical advice on hardware and software, advice on how you can get the images you want, and touch on some of the legal, ethical, and cultural challenges that we are seeing around this technology. My goal is to keep it practical so that anyone with a little basic computer knowledge can understand and use these articles to add another tool to their art toolbox, as cheaply and practically as possible. In this eighth post, we will discuss another Automatic1111 extension, Infinite Zoom, along with tips and tricks to get good results.
General Info on Extensions and Custom scripts:
Extensions allow you to edit and manipulate images through the software, add options to the interface, or just change the way the interface looks. They are basically a more convenient form of user scripts: you can install them easily from the UI, and each one lives in its own folder inside the extensions folder of the webUI. This also means you can point and click instead of trying to do everything at the command line. Alternatively, you can just copy and paste an extension's directory into the extensions folder. For developing extensions, see the “Developing extensions” page on the Automatic1111 wiki. To install custom scripts, place them into the scripts directory and click the Reload custom script button at the bottom of the Settings tab. After they are installed, custom scripts appear in the lower-left dropdown menu on the txt2img and img2img tabs. See the link to learn more about custom scripts and manually adding them to folders: https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/Custom-Scripts
Adding Extensions through the WebUI:
Before we talk about adding extensions, let's discuss a few points and best practices. Following these will save you a lot of time by avoiding situations where you have to troubleshoot and reset your system.
- INSTALL 1 EXTENSION AT A TIME! - Why? Because if you install one extension at a time, and verify things work between each install, it is MUCH easier to troubleshoot. If an extension causes an issue or has a dependency that causes your system not to work properly, it is easier to address than if you just installed five extensions.
- TEST BASIC FUNCTIONS AFTER EACH EXTENSION INSTALL. - You don’t need to test everything, though more testing is better, but at least make sure you can generate an image and it comes out as expected.
Those two steps can save you hours of troubleshooting when things go wrong.
NOTE: You cannot install extensions if you have your system set to listen on the network, or in other words, if you started the webUI with the command “./webui.sh --listen” when running Automatic1111. You will need to stop the system and restart it with just the command “./webui.sh” from the local system. This is a safety precaution so people can’t remotely execute code. After the extension is installed and verified, you can stop the services and restart as normal with the “./webui.sh --listen” command.
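The stop-and-restart routine in the note can be written down as a small helper so you don't have to remember which command goes with which task. This is only a minimal sketch: the function name build_launch_command is my own (it is not part of Automatic1111), and it assumes you launch from the folder that contains webui.sh.

```python
def build_launch_command(installing_extensions: bool) -> list[str]:
    """Return the webui.sh command line that suits the current task."""
    command = ["./webui.sh"]
    if not installing_extensions:
        # Normal operation: listen on the network, as described in the note.
        command.append("--listen")
    # While installing extensions, the bare command is used from the local system.
    return command


if __name__ == "__main__":
    print(" ".join(build_launch_command(installing_extensions=True)))
    print(" ".join(build_launch_command(installing_extensions=False)))
```

The helper only builds the command. You would still stop the running webUI yourself before starting it again with the other command.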
Installation Quick Steps for any Extension:
Article 7 includes a quick walkthrough of the general installation steps in the Automatic1111 webUI. See the link if you want a reminder of that process: https://eightbuff.blogspot.com/2023/06/article-seven-my-text-to-img-journey.html
To install the Infinite Zoom Extension:
For the Infinite Zoom extension, you can install it through the WebUI by going to the ‘Extensions’ tab and the ‘Available’ sub-tab.
- Click the “Load From” button to retrieve the latest list of extensions.
- Find ‘infinite-zoom’ and click ‘Install’, then wait.
- Once it is installed, it will disappear from the list.
OR
You can install it through the WebUI by going to the ‘Extensions’ tab and the ‘Install from URL’ sub-tab.
- Enter https://github.com/v8hid/infinite-zoom-automatic1111-webui.git for the URL, leave the second field empty, and wait for it to be installed.
Whichever method you use, once the install is complete, go to the ‘Installed’ sub-tab, click “Apply and Restart UI”, and wait for the webUI to restart.
You can check for updates and additional details on the Github site: https://github.com/v8hid/infinite-zoom-automatic1111-webui
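If you would rather place the extension in the extensions folder yourself, as mentioned in the general info section, a git clone into that folder is the usual way to do it. The sketch below only builds the command. The function name manual_install_command and the webUI path in the usage note are assumptions for illustration, so adjust them to your own setup.

```python
from pathlib import Path

# Repository URL taken from the install steps above.
REPO_URL = "https://github.com/v8hid/infinite-zoom-automatic1111-webui.git"


def manual_install_command(webui_root: str) -> list[str]:
    """Build a git clone command that puts the extension in its own folder."""
    # The folder name is the last part of the URL without the .git suffix.
    folder_name = REPO_URL.rsplit("/", 1)[-1].removesuffix(".git")
    target = Path(webui_root) / "extensions" / folder_name
    return ["git", "clone", REPO_URL, str(target)]


if __name__ == "__main__":
    # Hypothetical install location; replace with your own webUI folder.
    print(" ".join(manual_install_command("/opt/stable-diffusion-webui")))
```

To actually run it, you could pass the returned list to subprocess.run(command, check=True) while the webUI is stopped, then start the webUI again and test that you can still generate an image.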
Using Infinite Zoom:
Once you have restarted, you will notice the new ‘Infinite Zoom’ tab in Automatic1111. Go to that tab and you will see it is already populated with a default prompt. At this point you could just hit generate and, if everything is working fine, you will get a 5-second video of a waterfall in a jungle. Below is a video with all defaults using the 8buff model available on Civitai or on HuggingFace.
Tips and tricks for Infinite Zoom:
Since Infinite Zoom, in basic terms, is doing an outpaint of each image (generating images based off an original image to extend the picture), it requires different prompt areas.
- ‘Common Prompt Prefix’ contains the text that every prompt will begin with. By default it is populated with “Huge spectacular Waterfall in”.
- The next section provides the prompts for the original image and for the outpainting at different time intervals. This is where you make changes to get different images in your zoom video.
- ‘Common Prompt Suffix’ provides prompts to keep continuity between images as they are drawn.
- ‘Negative Prompt’ is, of course, the things you want to avoid having displayed in your image.
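To picture how the prompt areas fit together, here is a minimal sketch of the prefix, the timed prompt, and the suffix being combined into one prompt. I am assuming the pieces are simply joined with spaces. The extension's own joining logic may differ, and the function name compose_prompt and the example scene and suffix text are made up for illustration.

```python
def compose_prompt(prefix: str, scene: str, suffix: str) -> str:
    """Join the common prefix, one timed prompt, and the common suffix."""
    parts = [part.strip() for part in (prefix, scene, suffix)]
    # Skip empty fields so no stray spaces end up in the result.
    return " ".join(part for part in parts if part)


if __name__ == "__main__":
    prefix = "Huge spectacular Waterfall in"  # default prefix from the tab
    scene = "a dense tropical forest"         # hypothetical timed prompt
    suffix = "highly detailed"                # hypothetical suffix
    print(compose_prompt(prefix, scene, suffix))
```

Thinking about it this way makes it clear why the prefix should read naturally into each timed prompt: every one of them gets the same beginning and the same ending.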
Starting with a ‘Custom initial image’ helps provide an interesting focal point for your video, but be careful to generate and choose a good image for outpainting. Use the same model and similar prompts to generate an image using txt2img, then use similar prompts in the Infinite Zoom prompt field.
Stick with landscapes, tunnels, or other subjects that lend themselves to a blended view of images. Avoid people and creatures, since these may be cut into sections by the outpaint process.
Make sure your ‘Total Video Length [s]’ field matches or exceeds the maximum ‘Start at second [0,1,...]’ to ensure you are getting the images you want.
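The length check above is easy to do before you hit generate. This is a small sketch with a made-up function name, check_video_length, that lists any ‘Start at second’ values that fall past the total video length.

```python
def check_video_length(total_video_length: int, start_seconds: list[int]) -> list[int]:
    """Return the start times that fall beyond the total video length."""
    return [second for second in start_seconds if second > total_video_length]


if __name__ == "__main__":
    # A 5 second video with prompts starting at seconds 0, 3, and 7.
    too_late = check_video_length(5, [0, 3, 7])
    if too_late:
        print("These prompts start after the video ends:", too_late)
```

An empty list means the total length matches or exceeds every start time, so none of your prompts are cut off by the video length.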
The ‘Video’ sub-tab provides additional control for your video.
‘Frames per second’ can be left on the default of “30” until you have prompts that work consistently and want to make a high quality video, then bump it up higher.
‘Zoom mode’ provides options to zoom in or out. The zoom in mode seems to follow the same process in creating the images, just placing them backward when creating the video. So if you add an initial image, then the video will end on that image instead of starting on it.
The next two settings ‘number of start frame dupe’ and ‘number of last frame dupe’ allow you to freeze the video at the start or end, for effect.
The ‘Zoom Speed’ allows you to control the zoom speed in the video. I personally like to raise this to 2 or 2.5 to slow the video down and have a less dizzying effect.
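For the two frame dupe settings, it helps to know roughly how long the freeze will last. The sketch below assumes each duplicated frame takes up exactly one frame of the finished video, which I have not confirmed against the extension's code. The function name freeze_seconds is my own.

```python
def freeze_seconds(dupe_frames: int, frames_per_second: int) -> float:
    """Seconds the video holds still, assuming one video frame per duplicate."""
    if frames_per_second <= 0:
        raise ValueError("frames_per_second must be positive")
    return dupe_frames / frames_per_second


if __name__ == "__main__":
    # 30 duplicated frames at the default 30 frames per second.
    print(freeze_seconds(30, 30))
```

Under that assumption, if you raise ‘Frames per second’ later for a higher quality video, you would need to raise the dupe counts by the same ratio to keep the same freeze time.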
On the ‘OutPaint’ sub-tab, ‘Mask Blur’ is set to 48 and ‘Masked content’ is set to “latent noise”. I played around with these settings but found that the defaults gave the most consistent results.
Go to the ‘Post Process’ sub-tab and click ‘Enable Upscale’. I recommend using the ‘Upscaler’ “R-ESRGAN 4x+” and an ‘Upscale by factor’ of 3.5 to test. This will upscale all the images from the default 512 before it generates the video. If you have a video card with more than 8GB of memory, you can try upscaling larger.
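To see what the upscale factor means in pixels, here is the arithmetic as a short sketch. It assumes a square 512 by 512 starting image, and the function name upscaled_size is made up for illustration.

```python
def upscaled_size(width: int, height: int, factor: float) -> tuple[int, int]:
    """Return the image size after multiplying both sides by the factor."""
    return (round(width * factor), round(height * factor))


if __name__ == "__main__":
    new_width, new_height = upscaled_size(512, 512, 3.5)
    print(new_width, new_height)  # 1792 1792
    # Pixel count grows with the square of the factor (3.5 * 3.5 = 12.25).
    print((new_width * new_height) / (512 * 512))  # 12.25
```

Since every frame ends up with about twelve times as many pixels at a factor of 3.5, it makes sense to test with small settings first and only go larger once your prompts are working.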
Play around with small images and videos until you get prompts that consistently work for you and your model. Make sure you save your prompts by pressing the ‘Export Prompts’ button, which writes them to a .json file. Note that this only saves your prompts and not any of the other settings.
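Since the export only covers prompts, one option is to keep your own note of the other settings next to the exported file. This is a personal note-taking sketch and not a feature of the extension. The function name save_settings_note, the file names, and the dictionary keys (borrowed from the tab labels) are all my own choices.

```python
import json
from pathlib import Path


def save_settings_note(exported_prompts_path: str, settings: dict) -> Path:
    """Write the non-prompt settings to a JSON file beside the exported prompts."""
    prompts_path = Path(exported_prompts_path)
    # Example: waterfall.json gets a companion file named waterfall_settings.json.
    note_path = prompts_path.with_name(prompts_path.stem + "_settings.json")
    note_path.write_text(json.dumps(settings, indent=2), encoding="utf-8")
    return note_path


if __name__ == "__main__":
    settings = {
        "Frames per second": 30,
        "Zoom Speed": 2,
        "Mask Blur": 48,
        "Masked content": "latent noise",
        "Upscaler": "R-ESRGAN 4x+",
        "Upscale by factor": 3.5,
    }
    # Hypothetical file name for the exported prompts.
    print(save_settings_note("waterfall.json", settings))
```

The note is only for your own reference. You would still need to enter the values by hand the next time you open the tab.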
When you have good prompts, you can improve the quality of the video by bumping up the ‘Sampling Steps for each outpaint’ on the ‘Main’ sub-tab and by clicking ‘Enable Upscale’ on the ‘Post Process’ sub-tab.
In the screenshots, I have provided the best settings I found for the model we used in testing, 8buff. You can also see all our Infinite Zoom videos on our YouTube channel: https://www.youtube.com/playlist?list=PLGVbK024IpLaCuYpwW3kAwbWbam-KOrU-
https://www.youtube.com/channel/UCvWVColnywSh2WtviATTmCA
If you run into issues, we are happy to help our Patreon supporters with advice and tips: https://patreon.com/EightBuffaloMediaGroup
To summarize, the Infinite Zoom extension is a simple but fun way to add more capabilities to your generative AI toolbox. Thank all the people who manage these open source projects that make your work a little easier. Test your tools and your work through each install. Once everything is working, don’t be afraid to play with the settings and see what you can create. As always, be creative with your prompts; you never know what images the computer may provide that inspire you in new and different ways.