Researchers from Baidu, Nanjing University and Rutgers University Propose ‘Paint Transformer’ Tool To Predict The Parameters Of A Stroke Set With A Feed-Forward Network

Source: https://github.com/wzmsltw/PaintTransformer

Painting has long been a way for people to record what they perceive, or even how they imagine the world. It requires professional knowledge and skill, and is not easy for ordinary people, but with computer-aided art creation many of us can produce our own artistic compositions. With the rise of AI, natural images can be transformed into artwork via image style transfer or image-to-image translation, without requiring much expertise from the user.

The way humans paint is vastly different from how computers create images. Humans start with coarse brush strokes and then work their way down to the finest details of an image. Computers typically do not use strokes at all; instead, they optimize in pixel space until the result resembles the desired painting. This process has drawbacks: the character of authentic human creation is often lost when machine-learning algorithms are applied to tasks, such as artwork creation, that require a more nuanced understanding. Many researchers are addressing these issues by generating paintings stroke by stroke from scratch, an approach that may prove valuable both artistically and scientifically if it succeeds.

Inspired by the recent object detector DETR, researchers from Baidu, Nanjing University and Rutgers University propose the novel ‘Paint Transformer’, which generates a painting by predicting the parameters of multiple strokes with a feed-forward Transformer. Unlike object detection, stroke prediction has no annotated data available. That is why the researchers came up with a novel self-training pipeline that relies entirely on synthetically generated stroke images.
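To make the idea of feed-forward stroke prediction concrete, here is a minimal sketch in NumPy. The stroke parameterization (position, size, rotation, color) and the single linear layer standing in for the Transformer are illustrative assumptions, not the paper's actual architecture; the real model uses a DETR-style encoder-decoder over image features.

```python
import numpy as np

# Hypothetical stroke parameterization: x, y, height, width, angle, r, g, b.
# The paper uses a comparable brush-stroke vector; the exact fields may differ.
STROKE_DIM = 8

def predict_strokes(canvas_feat, target_feat, weights, n_strokes=8):
    """Toy feed-forward stroke predictor.

    A single linear layer maps the concatenated canvas/target features to a
    fixed-size set of stroke parameters in one forward pass (no iterative
    pixel optimization). A real model would use a Transformer here.
    """
    x = np.concatenate([canvas_feat, target_feat])  # joint representation
    out = weights @ x                               # one feed-forward pass
    params = out.reshape(n_strokes, STROKE_DIM)
    return 1.0 / (1.0 + np.exp(-params))            # squash into (0, 1)

rng = np.random.default_rng(0)
feat_dim = 64
canvas_feat = rng.standard_normal(feat_dim)
target_feat = rng.standard_normal(feat_dim)
W = rng.standard_normal((8 * STROKE_DIM, 2 * feat_dim)) * 0.1

strokes = predict_strokes(canvas_feat, target_feat, W)
print(strokes.shape)  # (8, 8): eight strokes, eight parameters each
```

The key contrast with pixel-space optimization is that all stroke parameters come out of one forward pass, which is what makes the method fast at inference time.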

The researchers first create a random background canvas containing some strokes. Next, they sample a set of foreground strokes and render them onto this canvas to obtain the target painting. The stroke predictor is then trained to minimize the pixel-level difference between its rendered output and this target, so that once trained it can predict strokes for new, unseen images.
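The self-training data generation described above can be sketched as follows. The rectangular, axis-aligned `render_stroke` is a deliberate simplification (the actual renderer uses textured, rotatable brushes), and all sizes and counts are illustrative assumptions.

```python
import numpy as np

def render_stroke(canvas, x, y, h, w, color):
    """Paint an axis-aligned rectangular stroke onto the canvas in place.

    Simplified stand-in for the paper's differentiable brush renderer.
    """
    H, W = canvas.shape[:2]
    y0, y1 = max(0, y), min(H, y + h)
    x0, x1 = max(0, x), min(W, x + w)
    canvas[y0:y1, x0:x1] = color
    return canvas

def random_strokes(rng, n, size):
    """Yield n random (x, y, h, w, color) stroke tuples."""
    for _ in range(n):
        yield (int(rng.integers(0, size)), int(rng.integers(0, size)),
               int(rng.integers(4, 16)), int(rng.integers(4, 16)),
               rng.random(3))

rng = np.random.default_rng(42)
size = 64

# 1) Random background canvas with a few strokes already on it.
canvas = np.zeros((size, size, 3))
for s in random_strokes(rng, 5, size):
    render_stroke(canvas, *s)

# 2) Render extra foreground strokes on a copy -> synthetic target painting.
target = canvas.copy()
for s in random_strokes(rng, 8, size):
    render_stroke(target, *s)

# 3) Training signal: per-pixel difference between canvas and target.
#    The predictor is trained so its rendered strokes close this gap.
pixel_loss = np.abs(canvas - target).mean()
```

Because both canvas and target are synthesized from known strokes, no human-annotated stroke data is ever needed; the pixel loss alone supervises the predictor.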

The self-trained Paint Transformer is impressive because it shows great generalization capability and can work for arbitrary natural images once trained. Extensive experiments demonstrate that this new feed-forward method can generate better quality paintings at lower cost than existing methods.

The proposed ‘Paint Transformer’ is a novel framework that generates paintings from natural images by predicting the parameters of multiple strokes in a feed-forward pass, and it does so without any manually collected data. The researchers’ self-training pipeline is what makes training the model possible. The model achieves a better tradeoff between artistic abstraction and realism than other methods while maintaining high efficiency.

Paper: https://arxiv.org/pdf/2108.03798.pdf

Github: https://github.com/wzmsltw/PaintTransformer