This article is based on the research paper 'Any-resolution Training for High-resolution Image Synthesis'. All credit for this research goes to the researchers of this paper 👏👏👏 Please don't forget to join our ML Subreddit
In machine learning, generative modeling is an unsupervised learning task that automatically detects and learns regularities or patterns in input data. The model may produce new examples that could have been drawn from the original dataset.
In most generative modeling pipelines, the initial step is to create a dataset with a fixed goal resolution. Images with a resolution higher than the objective are downsampled. Furthermore, data with the insufficient resolution is removed, resulting in the loss of structural information at low frequencies. The process of resolving the problem wastes potentially useful data.
Relaxing the fixed-dataset assumption opens up new possibilities by allowing researchers to understand both the overall structure and fine-scale characteristics of image data simultaneously. By combining higher-resolution images with already existing fixed-size datasets, better-resolution images can be generated at greater resolutions than previously achievable.
However, there are many modeling and scalability issues with this problem. The images must first be created at several resolutions to compare the target distribution, as naively downsampling the full-resolution image is inefficient. Second, because modern training architectures cannot fully utilize high-resolution images, creating and processing high-resolution images poses scalability issues.
Adobe researchers recently published a study in which they created a generator to synthesize images at arbitrary scales to overcome these difficulties, resulting in any-resolution training. To design the generator, they modified the state-of-the-art architecture and drew inspiration from recent work in coordinate-based conditioning. The generator can produce patches of the same image at different scales, allowing the team to generate at any scale quickly.
The team tested the generator extensively on a variety of datasets and found minimal degradation even at drastically skewed distributions, implying that the team could use large-scale lower-resolution datasets and a small number of high-resolution images to achieve continuous-resolution synthesis beyond the current 1024 resolution limit of modern generators.
Adobe researchers recently proposed an image synthesis method that can train on images of varying resolutions. The method can infer at continuous resolutions, overcoming the fixed-resolution requirement of previous generative models. The team discovered that by using a model for continuous-resolution synthesis, they could effectively use information included in only a few high-resolution images to complement a huge set of low-resolution images. Unlike earlier research, this method allows for high-resolution synthesis without the need for a larger training generator or a huge dataset of fixed-size, high-resolution images.