Ultimate Solution for Generating Ultra-High-Resolution Images Using Stable Diffusion WebUI

Introduction

Generating high-resolution images with Stable Diffusion WebUI's txt2img mode requires a significant amount of RAM, VRAM, and compute time. It is not cost-effective to generate a 4K image in one shot, because many results turn out bad or unsatisfactory. I prefer to generate low-resolution images quickly to find good seeds and prompts, then upscale the promising results to 4K in img2img mode.

Challenges

Things are not that simple; we still face many challenges, including performance, denoising strength, and image quality.

Both Hires. fix and plain img2img upscaling have performance issues, as both consume a significant amount of RAM and VRAM. The SD Upscale script solves the performance problem by splitting the image into tiles, upscaling each tile separately, and stitching the results back together. However, with SD Upscale a denoising strength higher than about 0.2 can leave a visible grid pattern in the result. Because each tile is regenerated independently, a higher denoising strength lets neighboring tiles drift apart in content, and the seams between them become visible.
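The tiling step can be sketched in a few lines of Python. This is a minimal illustration, not the SD Upscale script's actual code: the function name `tile_grid`, the 512 px tile size, and the 64 px minimum overlap are assumptions chosen for the example.

```python
import math

def tile_grid(width, height, tile=512, overlap=64):
    """Return (left, top, right, bottom) boxes that cover the image
    with tiles overlapping by at least `overlap` pixels (hypothetical helper)."""
    def starts(size):
        # A single tile is enough when the image fits inside it.
        if size <= tile:
            return [0]
        stride = tile - overlap
        count = math.ceil((size - overlap) / stride)
        # Spread the tiles evenly so the last one ends exactly at the image edge.
        step = (size - tile) / (count - 1)
        return [round(i * step) for i in range(count)]

    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in starts(height)
        for x in starts(width)
    ]

boxes = tile_grid(2048, 2048)
print(len(boxes), boxes[0], boxes[-1])
```

Every tile is processed on its own, which keeps memory use bounded by the tile size rather than the full image size. The overlapping margins are where neighboring tiles have to agree, and that is exactly where seams show up when they do not.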

Denoising strength is crucial when upscaling images to high resolution, because it controls the balance between added detail and fidelity to the source. Increasing denoising strength can add more detail and enhance the overall quality of the image, but it also risks introducing inaccuracies, such as artifacts, distorted characters, loss of fine details, or an unnatural smoothness. Conversely, a lower denoising strength stays closer to the original image but may upscale it inadequately, leaving a blurry result that lacks detail. The optimal denoising strength is therefore a careful balance, aimed at enriching detail without compromising the integrity of the original image.

A denoising strength of 0.2 is too low for upscaling: the result stays close to the source but gains little new detail.
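One way to see why a low value changes so little is to look at how many sampling steps actually run. The sketch below assumes the WebUI's default img2img behavior, in which the number of steps executed is roughly the sampling steps multiplied by the denoising strength. The function name `effective_img2img_steps` is hypothetical, and the formula is an approximation of that default, not a guaranteed description of every configuration.

```python
def effective_img2img_steps(sampling_steps, denoising_strength):
    """Approximate number of denoising steps img2img runs
    (assumption: default WebUI setting, steps scaled by strength)."""
    # Clamp to a valid range; a strength of exactly 1.0 is treated as just below 1.
    strength = min(max(denoising_strength, 0.0), 0.999)
    return int(strength * sampling_steps)

for strength in (0.25, 0.5, 0.75):
    print(strength, effective_img2img_steps(20, strength))
```

Under this assumption, a low strength leaves the sampler only a handful of steps to work with, while a higher strength both changes more of the image and takes proportionally longer.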

Solution: ControlNet Tile Model

I used to spend a lot of time finding the right balance between denoising strength and image quality, but now I have found a better solution.

My solution combines the SD Upscale script with the ControlNet Tile model for img2img upscaling. SD Upscale divides the image into overlapping tiles and upscales each one individually, while the Tile model conditions the generation of each tile on the corresponding part of the original image. This keeps every tile anchored to the source content even at a higher denoising strength, so neighboring tiles stay consistent and the transitions between them remain smooth. As a result, the combination significantly mitigates the grid-like artifacts often seen in tiled upscaling and produces a more detailed, more coherent result.

Enhance Image Quality with FreeU

FreeU (sd-webui-freeu) is an extension for Stable Diffusion WebUI that enhances image quality by altering the model's denoiser. It works by modifying the U-Net noise predictor in the model, aiming to improve the global composition and fine details of the images. The main functions of FreeU include:

  • Improved Image Details: It sharpens images and provides higher contrast, especially effective with Anime models or realistic painting styles.
  • Adjustable Effects: Users can control the intensity of the modifications through scaling factors, balancing enhanced details against the risk of oversmoothing.
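To give a feel for what the scaling factors do, here is a toy one-dimensional sketch in Python. It follows the FreeU method as it is usually described (a backbone factor that amplifies part of the U-Net backbone features, and a skip factor that rescales the low-frequency part of the skip-connection features), but it is not the sd-webui-freeu code. The function names, the `cutoff` parameter, and the use of plain lists in place of feature tensors are all assumptions made for illustration.

```python
import cmath

def scale_backbone(features, b):
    """Multiply the first half of the backbone channels by b (toy version)."""
    half = len(features) // 2
    return [v * b if i < half else v for i, v in enumerate(features)]

def scale_low_frequencies(signal, s, cutoff=1):
    """Scale the low-frequency components of a 1-D signal by s, using a naive DFT."""
    n = len(signal)
    # Forward discrete Fourier transform.
    spectrum = [
        sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    # Bins within `cutoff` of index 0 (from either end) count as low frequency.
    scaled = [c * s if min(k, n - k) < cutoff else c for k, c in enumerate(spectrum)]
    # Inverse transform back to the signal domain; keep the real part.
    return [
        (sum(scaled[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]

print(scale_backbone([1.0, 1.0, 1.0, 1.0], 1.2))
print(scale_low_frequencies([2.0, 2.0, 2.0, 2.0], 0.5))
```

In this toy version, a backbone factor above 1 strengthens the amplified channels, and a skip factor below 1 damps only the smooth, low-frequency content while leaving rapid variations untouched.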

My Workflow

  1. Generate a base image using txt2img mode to search for good seeds and prompts quickly.
    1. Set the image size based on 512px, such as 512x512 or 512x768.
    2. Use a fast sampling method like Euler a to generate images quickly.
    3. Set sampling steps to 10 for quick generation.
    4. Enable the ControlNet OpenPose model if I want to generate images with human figures.
  2. Upscale the good results to 4K using the SD Upscale script and ControlNet Tile model in img2img mode:
    1. Set denoising strength to 0.5; higher denoising strength gets better results but takes longer to generate.
    2. Enable ControlNet and choose the Tile/Blur model.
    3. Enable FreeU to enhance image quality.
    4. Enable SD Upscale to split the image into tiles and upscale them individually.
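The scale factor needed for the second stage follows from simple arithmetic. The sketch below assumes that "4K" means a long side of 3840 px, which is an assumption of this example and not a setting taken from the WebUI; the function name `upscale_plan` is likewise hypothetical.

```python
def upscale_plan(base_width, base_height, target_long_side=3840):
    """Return (scale_factor, out_width, out_height) so that the longer side
    of the base image reaches target_long_side (assumed meaning of 4K)."""
    factor = target_long_side / max(base_width, base_height)
    return factor, round(base_width * factor), round(base_height * factor)

for size in ((512, 512), (512, 768)):
    print(size, upscale_plan(*size))
```

For a 512x768 base image this gives a scale factor of 5 and an output of 2560x3840, which is why finding good seeds at low resolution first saves so much time: each rejected candidate costs a small fraction of the work of a full-size image.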