Meet Diffusion-GAN: A Novel GAN Framework That Leverages A Forward Diffusion Chain To Generate Gaussian-Mixture Distributed Instance Noise

Generative Adversarial Networks (GANs) have been widely used in recent years to generate synthetic data for many applications. The most commonly considered domain is computer vision, where GANs can be used, for instance, to generate photo-realistic high-resolution images. One of the main problems of GANs is the instability of their training process. A possible way to stabilize training is to add noise to the input data provided to the discriminator, a technique known as instance noise. This widens the support of both the generator and discriminator distributions and, at the same time, keeps the discriminator from overfitting. Unfortunately, injecting instance noise is not straightforward, because choosing a suitable noise distribution for the data is challenging.

In this paper, researchers from the University of Texas at Austin and Microsoft Azure AI propose Diffusion-GAN, a method that uses a diffusion process to generate Gaussian-mixture distributed instance noise. This appears to be the first work to show empirically that instance noise can improve GAN training on high-dimensional image data.

Figure 1 shows how Diffusion-GAN adds noise to the images. The diffusion process is applied identically to real and generated data, and it consists of a sequence of steps that gradually add noise to each image. It starts from the original image and adds Gaussian noise at each step, progressively erasing information until a target noise level is reached after T steps. While vanilla GANs compare real and fake images directly, the discriminator of Diffusion-GAN compares their noisy versions. These noisy versions are sampled from a Gaussian mixture distribution defined over the diffusion steps.
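To make the noising step more concrete, here is a minimal sketch of a standard forward diffusion chain in pure Python. It relies on the well-known closed form for Gaussian diffusion, in which the sample at step t is a scaled copy of the original plus Gaussian noise. The linear noise schedule, its default values, the uniform choice of the step, and all function names are assumptions made for illustration; they are not taken from the authors' code.

```python
import math
import random

def make_alpha_bars(T, beta_start=1e-4, beta_end=2e-2):
    """Cumulative signal fractions alpha_bar_t = prod_{s<=t} (1 - beta_s).

    Uses a linear beta schedule (an assumption for this sketch).
    """
    alpha_bars, prod = [], 1.0
    for s in range(T):
        beta = beta_start + (beta_end - beta_start) * s / max(T - 1, 1)
        prod *= 1.0 - beta
        alpha_bars.append(prod)
    return alpha_bars

def diffuse(x, t, alpha_bars, rng):
    """Sample the noisy version of x after t steps (t is 1-indexed).

    Closed form: y = sqrt(alpha_bar_t) * x + sqrt(1 - alpha_bar_t) * eps,
    with eps drawn from a standard normal distribution.
    """
    a = alpha_bars[t - 1]
    signal, noise = math.sqrt(a), math.sqrt(1.0 - a)
    return [signal * v + noise * rng.gauss(0.0, 1.0) for v in x]

def sample_noisy(x, alpha_bars, rng):
    """Pick a step t uniformly from 1..T, then diffuse x to that step.

    Mixing over t is what makes the overall noise a Gaussian mixture.
    """
    t = rng.randint(1, len(alpha_bars))
    return diffuse(x, t, alpha_bars, rng), t

# Usage: the same routine is applied to real and generated images.
rng = random.Random(0)
alpha_bars = make_alpha_bars(T=100)
image = [0.5] * 16  # a toy "image" flattened to 16 pixel values
noisy_image, step = sample_noisy(image, alpha_bars, rng)
```

Because the step is itself random, the discriminator sees each image at many different noise levels over the course of training.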

At each diffusion step, the loss function designed by the authors encourages the discriminator to assign high probabilities to the noisy versions of real samples and low probabilities to the noisy versions of the fake images produced by the generator. The generator, in turn, tries to produce samples that fool the discriminator at every step.
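The two objectives described above can be sketched with a standard binary cross-entropy GAN loss evaluated on the discriminator's outputs for noisy inputs. This is only an illustration: the function names are hypothetical, the inputs are assumed to be probabilities already produced by a discriminator, and the generator term uses the common non-saturating form, which may differ from the exact formulation in the paper.

```python
import math

EPS = 1e-12  # avoids log(0)

def discriminator_loss(p_real_noisy, p_fake_noisy):
    """Low when noisy real samples get high probabilities
    and noisy fake samples get low probabilities."""
    real_term = -sum(math.log(p + EPS) for p in p_real_noisy) / len(p_real_noisy)
    fake_term = -sum(math.log(1.0 - p + EPS) for p in p_fake_noisy) / len(p_fake_noisy)
    return real_term + fake_term

def generator_loss(p_fake_noisy):
    """Low when the discriminator is fooled, i.e. assigns high
    probabilities to the noisy fake samples (non-saturating form)."""
    return -sum(math.log(p + EPS) for p in p_fake_noisy) / len(p_fake_noisy)

# Usage: a confident, correct discriminator has a lower loss
# than one that outputs 0.5 for everything.
good = discriminator_loss([0.9, 0.9], [0.1, 0.1])
unsure = discriminator_loss([0.5, 0.5], [0.5, 0.5])
```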

Moreover, the intensity of the noise added by the diffusion process is adjusted during training according to how well the discriminator can tell real samples from fake ones. At the beginning, the noise level is low, so the discriminator starts by learning to distinguish images that are close to the originals. Over time, the noise is gradually increased to make the discriminator's task harder, which mitigates its overfitting.
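One plausible way to implement such an adaptive schedule is sketched below: estimate how confidently the discriminator classifies noisy real samples, then lengthen or shorten the diffusion chain accordingly. The function names, the target value, the step size, and the bounds are all hypothetical choices for this sketch, not values confirmed by the article.

```python
def overfit_score(p_real_noisy):
    """Mean of sign(p - 0.5) over discriminator outputs on noisy real samples.

    Close to +1 when nearly all real samples are confidently called real,
    which is treated here as a sign of overfitting.
    """
    signs = [(p > 0.5) - (p < 0.5) for p in p_real_noisy]
    return sum(signs) / len(signs)

def update_max_steps(T, score, target=0.6, step=8, t_min=5, t_max=1000):
    """Lengthen the chain (more noise) if the discriminator looks too
    confident, shorten it (less noise) if it is struggling."""
    if score > target:
        T += step
    elif score < target:
        T -= step
    return max(t_min, min(t_max, T))

# Usage: call periodically during training, e.g. every few minibatches.
T = 100
score = overfit_score([0.9, 0.8, 0.7, 0.2])
T = update_max_steps(T, score)
```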

Overall, the strategy proposed in this paper brings two benefits. The first is that GAN training is stabilized by mitigating the vanishing gradient problem, which arises when the real-data and generator distributions differ so much that the discriminator provides little useful gradient to the generator. The second is that considering different noisy versions of the same image improves the diversity of the samples produced by the generator.

To conclude, the following figure shows some images generated by the Diffusion-GAN version of StyleGAN2 on the FFHQ dataset.

[Figure: sample images generated by the diffusion version of StyleGAN2 on FFHQ]

Luca is a Ph.D. student at the Department of Computer Science of the University of Milan. His interests are Machine Learning, Data Analysis, IoT, Mobile Programming, and Indoor Positioning. His research currently focuses on Pervasive Computing, Context-awareness, Explainable AI, and Human Activity Recognition in smart environments.