Image content is an integral part of marketing campaigns, websites, and banners. Designers use tools like Adobe Photoshop to produce compelling marketing content, yet they spend a lot of time selecting, editing, polishing, and compositing raw images. Researchers at Adobe propose a novel conditioning strategy for Generative Adversarial Networks (GANs) to simplify content creation.
Neural networks such as GANs are widely used because they can generate new images similar to those they were trained on. However, these networks create images from randomly sampled latent vectors, which gives users no control over the semantic attributes of the pictures. Conditional GANs let users specify the desired features, but they require training the network with a conditional adversarial loss for the chosen attributes. Adding new attributes also means re-designing and re-training the network.
In their recent paper, "Directional GAN: A Novel Conditioning Strategy for Generative Networks," the team introduces Directional GAN (DGAN), a simple approach that generates high-resolution images conditioned on desired semantic attributes. The method separates generation from conditioning. Because the two functions are separate, they can be trained independently, which simplifies the complex task of conditional generation.
The team uses a GAN generator to produce a realistic image from a random latent vector. They then learn linear hyperplanes in the latent space that separate the values of each attribute. Moving the latent vector appropriately with respect to these hyperplanes conditions the generated image on the desired attribute values.
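The idea of a separating hyperplane in latent space can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the latent vectors are random toy data, the binary attribute labels are synthetic, the hyperplane is fitted with plain logistic regression, and the helper names `fit_hyperplane` and `move_to_side` are hypothetical. It is not the training procedure from the paper, and no generator is involved.

```python
import numpy as np

def fit_hyperplane(latents, labels, lr=0.1, steps=500):
    """Fit (w, b) so that w.z + b > 0 predicts label 1 (logistic regression)."""
    n, d = latents.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(steps):
        logits = latents @ w + b
        probs = 1.0 / (1.0 + np.exp(-logits))
        grad = probs - labels              # gradient of the log loss w.r.t. the logits
        w -= lr * (latents.T @ grad) / n
        b -= lr * grad.mean()
    return w, b

def move_to_side(z, w, b, target, margin=1.0):
    """Shift z by the smallest step along w so it sits at least `margin`
    away from the hyperplane on the side of the target label (1 or 0)."""
    unit_w = w / np.linalg.norm(w)
    sign = 1.0 if target == 1 else -1.0
    distance = (z @ w + b) / np.linalg.norm(w)   # signed distance to the hyperplane
    needed = sign * margin - distance
    if sign * needed <= 0:                       # already far enough on the target side
        return z.copy()
    return z + needed * unit_w

# Toy data (assumption): 16-dimensional latents with a synthetic binary attribute.
rng = np.random.default_rng(0)
latents = rng.standard_normal((400, 16))
true_w = rng.standard_normal(16)
labels = (latents @ true_w > 0).astype(float)

w, b = fit_hyperplane(latents, labels)
accuracy = np.mean((latents @ w + b > 0) == (labels == 1))

z = rng.standard_normal(16)
z_on = move_to_side(z, w, b, target=1)    # latent pushed to the "attribute present" side
z_off = move_to_side(z, w, b, target=0)   # latent pushed to the "attribute absent" side
print(accuracy, z_on @ w + b, z_off @ w + b)
```

In a real pipeline the shifted latent vector would be passed to the pretrained generator, which stays untouched. That is the practical benefit of keeping generation and conditioning separate.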
The researchers validated their approach in two ways:
- They generated full-body human images conditioned on clothing style and pose, using the MPV and DeepFashion datasets.
- They generated face images conditioned on facial attributes such as hair color, gender, and degree of smile, using the CelebA-HQ face dataset.
In their evaluation, the researchers report that DGAN achieves around 89 percent accuracy when conditioning on gender and about 78 percent accuracy when conditioning on hair color. For the degree of smile, a continuous-valued attribute, the approach achieves a root mean square error (RMSE) as low as 0.134.
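For readers unfamiliar with the two metrics, the sketch below shows how conditioning accuracy and RMSE are typically computed. It assumes that the attribute values of the generated images have already been read off by some attribute predictor. The function names and the small arrays are made up for illustration and are not results from the paper.

```python
import numpy as np

def conditioning_accuracy(requested, predicted):
    """Fraction of generated images whose predicted class matches the requested one."""
    requested = np.asarray(requested)
    predicted = np.asarray(predicted)
    return float(np.mean(requested == predicted))

def rmse(requested, measured):
    """Root mean square error between requested and measured continuous values."""
    diff = np.asarray(requested, dtype=float) - np.asarray(measured, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

# Illustrative values only (assumption): four images for a binary attribute,
# two images for a continuous attribute.
acc = conditioning_accuracy([1, 0, 1, 1], [1, 0, 0, 1])
err = rmse([0.2, 0.8], [0.3, 0.6])
print(acc, err)
```

Accuracy suits binary and multi-class attributes such as gender or hair color, while RMSE suits continuous ones such as the degree of smile, where a near miss should count for more than a large miss.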
According to the researchers, DGAN outperforms state-of-the-art approaches in generating high-resolution full-body human images. It also supports conditioning on binary, multi-class, and continuous-valued attributes, and gives fine-grained control over attributes during generation. This helps accelerate and improve the image creation experience.
The team states that moving the latent vector to the desired subspace using convex optimization, without requiring image attributes to be disentangled, leaves considerable scope for future work. They suggest that DGAN could be extended to edit real images by first inverting a given image back to a latent vector, using either an optimization-based or an encoder-based inversion approach. This latent vector could then be modified using DGAN.
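To see why disentangled attributes are not strictly needed, consider a simplified special case: several attributes, each with a known hyperplane, and a target score for each one. The smallest shift of the latent vector that hits all targets at once has a closed form through the pseudoinverse, even when the hyperplane normals are correlated. The sketch below is an illustration under these assumptions, with a hypothetical helper `joint_shift` and made-up numbers. The paper's convex optimization is more general, and inequality constraints for binary attributes would require a proper solver.

```python
import numpy as np

def joint_shift(z, W, b, targets):
    """Return the latent closest to z (in L2 distance) with W @ z_new + b == targets.

    W holds one hyperplane normal per row; rows may be correlated (entangled)
    as long as they are linearly independent.
    """
    residual = targets - (W @ z + b)
    delta = np.linalg.pinv(W) @ residual   # minimum-norm solution of W @ delta = residual
    return z + delta

# Two correlated attribute directions in a 4-dimensional latent space (toy numbers).
W = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.6, 0.8, 0.0, 0.0]])
b = np.zeros(2)
z = np.array([0.0, 0.0, 1.0, 1.0])
targets = np.array([1.0, -1.0])            # desired score for each attribute

z_new = joint_shift(z, W, b, targets)
print(z_new, W @ z_new + b)
```

Solving for both attributes jointly matters here. Moving along the first normal alone would also change the second attribute's score, because the two directions overlap.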
Tanushree Shenwai is a consulting intern at MarktechPost. She is currently pursuing her B.Tech at the Indian Institute of Technology (IIT), Bhubaneswar. She is a data science enthusiast with a keen interest in the applications of artificial intelligence across various fields. She is passionate about exploring new advancements in technology and their real-life applications.