Enhance Details & Quality: Upscaling into Super-Resolution with ESRGAN


CogniWerk Editor

Date: 17.10.2023

ESRGAN (Enhanced Super-Resolution Generative Adversarial Networks) is a deep-learning approach to image super-resolution: increasing the resolution of a low-resolution image while maintaining or even enhancing its detail and quality. In this blog post, I compare outputs to inspect the quality and show you when and how I use ESRGAN.

What is ESRGAN?

ESRGAN uses a generative adversarial network (GAN) to achieve this goal. A GAN is a type of neural network that consists of two sub-networks: a generator network and a discriminator network. The generator network creates high-resolution images from low-resolution ones, while the discriminator network tries to distinguish between real high-resolution images and fake ones generated by the generator network.

During training, the two networks compete against each other, with the generator network trying to produce images that can fool the discriminator network into thinking they are real high-resolution images. As a result, the generator network learns to produce increasingly better high-resolution images over time.
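To make the competition concrete, here is a minimal sketch of the classic GAN objective in plain Python. It is not ESRGAN's actual training code (ESRGAN combines several loss terms), and the discriminator outputs below are made-up numbers chosen only for illustration.

```python
import math

def bce(prob, target):
    """Binary cross-entropy for one probability and a 0/1 target."""
    eps = 1e-12  # keeps log() away from zero
    prob = min(max(prob, eps), 1.0 - eps)
    return -(target * math.log(prob) + (1 - target) * math.log(1 - prob))

def discriminator_loss(d_real, d_fake):
    """The discriminator wants real images scored 1 and generated ones scored 0."""
    real_term = sum(bce(p, 1) for p in d_real) / len(d_real)
    fake_term = sum(bce(p, 0) for p in d_fake) / len(d_fake)
    return real_term + fake_term

def generator_loss(d_fake):
    """The generator wants its images scored 1 (judged real) by the discriminator."""
    return sum(bce(p, 1) for p in d_fake) / len(d_fake)

# Hypothetical discriminator scores (probability that an image is real).
early_fake = [0.1, 0.2]  # early in training: generated images are easily caught
later_fake = [0.6, 0.7]  # later: the generator fools the discriminator more often

# The generator's loss falls as it gets better at fooling the discriminator.
assert generator_loss(later_fake) < generator_loss(early_fake)
```

The two losses pull in opposite directions on the same scores, which is the "competition" described above.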

ESRGAN is an enhanced version of the original SRGAN (Super-Resolution Generative Adversarial Network) approach. Its generator replaces SRGAN's basic residual blocks with Residual-in-Residual Dense Blocks (RRDB), which combine residual learning with dense connections and drop batch normalization. ESRGAN also trains with a relativistic discriminator and a perceptual loss computed on features before activation, which improves the sharpness and texture of the generated images.

Improve the Quality of Images

There are multiple ways to use an upscaler, but I find it very handy that the AUTOMATIC1111 WebUI has a dedicated tab for it that fits smoothly into a workflow pipeline. When generating output via txt2img or img2img, a useful button at the bottom sends the output image to the Extras tab, which is the built-in upscaler with ESRGAN. There are also many external services and other options for using an AI-powered upscaler.

In these examples, I used the 4x model as the main upscaler. You can also experiment with other models and even combine them while controlling their weights.

[Images: Blog Esrgan1 to Blog Esrgan6, example outputs]
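If you prefer scripting over clicking, the same upscaling step can be sketched against the WebUI's HTTP API. The sketch assumes the WebUI was started with the `--api` flag and listens on the default local address. The endpoint path and field names are written from memory of that API and should be checked against the `/docs` page of your own instance. The upscaler names are assumptions too and must match what your installation lists. The helper functions are my own illustration, not part of the WebUI.

```python
import base64
import json
import urllib.request

def build_extras_payload(image_bytes, scale=4.0, upscaler_1="ESRGAN_4x",
                         upscaler_2="None", upscaler_2_visibility=0.0):
    """Build the JSON body for an upscaling request (field names are assumptions)."""
    if not 0.0 <= upscaler_2_visibility <= 1.0:
        raise ValueError("upscaler_2_visibility must be between 0 and 1")
    return {
        "image": base64.b64encode(image_bytes).decode("ascii"),
        "resize_mode": 0,                    # 0: scale by a factor
        "upscaling_resize": scale,           # e.g. 4.0 for a 4x upscale
        "upscaler_1": upscaler_1,            # main upscaler model
        "upscaler_2": upscaler_2,            # optional second model to blend in
        "extras_upscaler_2_visibility": upscaler_2_visibility,  # blend weight
    }

def upscale(image_bytes, base_url="http://127.0.0.1:7860", **options):
    """Send the image to a locally running WebUI and return the upscaled bytes."""
    payload = build_extras_payload(image_bytes, **options)
    request = urllib.request.Request(
        base_url + "/sdapi/v1/extra-single-image",
        data=json.dumps(payload).encode("utf-8"),  # a request body makes this a POST
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        result = json.load(response)
    return base64.b64decode(result["image"])
```

With a running instance, something like `upscale(open("input.png", "rb").read(), upscaler_2="R-ESRGAN 4x+", upscaler_2_visibility=0.3)` would blend a second model into the result at 30% weight, assuming both model names exist in your installation.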

Overall, ESRGAN is a powerful AI tool for image super-resolution that is useful in a variety of applications. Because it is the last step in my image generation pipeline, I tend to generate my very first images at a low resolution like 512x512 so I can get more outputs in less time. Once the img2img and inpainting work is done, I upscale the images in the concluding step of the workflow.
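A quick bit of arithmetic shows why working small first pays off. The helper names below are only for illustration, and generation time does not scale exactly with pixel count, so treat the ratio as a rough guide.

```python
def upscaled_size(width, height, factor):
    """Return the output dimensions after upscaling by an integer factor."""
    return width * factor, height * factor

def pixel_ratio(factor):
    """How many times more pixels the upscaled image has than the original."""
    return factor * factor

base_width, base_height = 512, 512
print(upscaled_size(base_width, base_height, 4))  # (2048, 2048)
print(pixel_ratio(4))                             # 16
```

A 512x512 draft holds one sixteenth of the pixels of the final 2048x2048 image, so the slow, iterative steps run on far less data and only the finished result is upscaled.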