Princeton Student Researcher Proposes Sketch-And-Paint GAN (SAPGAN): A GAN Framework For Chinese Landscape Painting Generation

The emergence and continual refinement of Generative Adversarial Networks (GANs) have enabled AI to produce impressive artworks in a wide range of styles. Alice Xue, a Princeton undergraduate, has recently designed a GAN framework for Chinese landscape painting generation whose outputs are frequently mistaken for human-made paintings.

According to the paper, the proposed framework, Sketch-And-Paint GAN (SAPGAN), is the first end-to-end model for Chinese landscape painting generation that works without conditional input. In a visual Turing test, 242 participants identified SAPGAN paintings as human artwork significantly more often than the outputs of baseline GANs. Xue explains that popular GAN-based art generation methods such as style transfer rely heavily on conditional inputs. Models dependent on conditional input have limited generative capability because each image is built from a single, human-fed input; as a result, they can only produce derivative artworks that are stylistic copies of that input.

Xue therefore proposes that a model not reliant on conditional input could generate an unlimited set of paintings seeded from latent space. Through this end-to-end creation process, the output would vary artistically in content as well as style.

This method mimics the “sketch-and-paint” process of traditional Chinese landscape painters. SAPGAN is therefore designed with two stages (a minimal code sketch of the pipeline follows the list):

  • SketchGAN component: generates edge maps from latent vectors
  • PaintGAN component: performs the subsequent edge-to-painting translation
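A minimal PyTorch sketch of how the two stages chain together. The class names, layer sizes, and latent dimension below are illustrative assumptions, not the paper's actual networks; the point is the data flow (latent vector, then edge map, then painting) with no conditional input anywhere:

```python
import torch
import torch.nn as nn

LATENT_DIM = 512   # assumed latent size, not taken from the paper
IMG_SIZE = 64      # kept small so the sketch runs instantly

class SketchGenerator(nn.Module):
    """Stage 1 (SketchGAN role): latent vector -> one-channel edge map."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(LATENT_DIM, IMG_SIZE * IMG_SIZE),
            nn.Tanh(),
        )

    def forward(self, z):
        return self.net(z).view(-1, 1, IMG_SIZE, IMG_SIZE)

class PaintGenerator(nn.Module):
    """Stage 2 (PaintGAN role): edge map -> three-channel painting."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(16, 3, kernel_size=3, padding=1),
            nn.Tanh(),
        )

    def forward(self, edge_map):
        return self.net(edge_map)

# End-to-end generation: every painting is seeded purely from latent space.
sketcher, painter = SketchGenerator(), PaintGenerator()
z = torch.randn(4, LATENT_DIM)     # random seeds, no human-fed input
edge_maps = sketcher(z)            # stage 1: generate edge maps
paintings = painter(edge_maps)     # stage 2: edge-to-painting translation
print(paintings.shape)             # torch.Size([4, 3, 64, 64])
```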

To improve training, Xue curated a new dataset of roughly 2,000 high-quality traditional Chinese landscape paintings sourced from museum collections.
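Training the sketch-and-paint pipeline requires pairing each painting with its edge map. The paper extracts edges with HED (Holistically-Nested Edge Detection); the sketch below substitutes OpenCV's Canny detector as a lightweight stand-in, with hypothetical folder names and assumed thresholds:

```python
import cv2
from pathlib import Path

# Build (edge map, painting) training pairs. Canny is only a simple
# stand-in for the HED edge detector used in the paper, and the
# directory names here are hypothetical.
SRC = Path("paintings")    # hypothetical folder of curated paintings
DST = Path("edge_maps")
DST.mkdir(exist_ok=True)

for img_path in SRC.glob("*.jpg"):
    img = cv2.imread(str(img_path), cv2.IMREAD_GRAYSCALE)
    if img is None:
        continue                                 # skip unreadable files
    blurred = cv2.GaussianBlur(img, (5, 5), 0)   # suppress brush texture
    edges = cv2.Canny(blurred, 50, 150)          # assumed thresholds
    cv2.imwrite(str(DST / img_path.name), edges)
```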

Compared with RaLSGAN and StyleGAN2, the proposed SAPGAN model was judged better in both realism and artistic composition. In the visual Turing test, human evaluators looked at 18 paintings: six each from SAPGAN, from human painters, and from the baseline RaLSGAN. SAPGAN paintings were chosen as human-produced 55 percent of the time, while the baseline RaLSGAN's generations managed a fooling frequency of only 11 percent.
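The "fooling frequency" here is simply the share of individual judgments in which a model's painting was labeled human-made. A tiny worked example, with hypothetical judgment counts chosen only to reproduce the reported rates:

```python
# Fooling frequency = judgments of "human-made" / total judgments.
# The counts below are hypothetical, picked to match the reported rates.
def fooling_rate(judged_human: int, total: int) -> float:
    return judged_human / total

print(f"SAPGAN:  {fooling_rate(55, 100):.0%}")   # reported: 55%
print(f"RaLSGAN: {fooling_rate(11, 100):.0%}")   # reported: 11%
```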


Xue believes the research can help lay the groundwork for truly machine-original art generation. She notes that the model is not confined to Chinese paintings and can generalize to other artistic styles that emphasize edge definition.

Paper: https://arxiv.org/pdf/2011.05552.pdf

Github: https://github.com/alicex2020/Chinese-Landscape-Painting-Dataset

Shilpi is a contributor at Marktechpost.com. She is currently in her third year of a B.Tech in computer science and engineering at IIT Bhubaneswar. She has a keen interest in exploring the latest technologies and likes to write about different domains and learn about their real-life applications.
