NVIDIA has developed Imaginaire, a general-purpose PyTorch library that provides optimized implementations of several GAN-based image and video synthesis methods.
The Imaginaire library currently covers three types of models, providing tutorials for each of them:
- Supervised image-to-image translation
- Unsupervised image-to-image translation
- Video-to-video translation
Imaginaire implements different algorithms for each model type, including COCO-FUNIT, SPADE/GauGAN, and Multimodal Unsupervised Image-to-Image Translation (MUNIT).
One of the projects developed using Imaginaire is COCO-FUNIT. It was trained on an NVIDIA DGX-1 with 8 V100 32GB GPUs. It translates a content image into the style of an example image, computing the style code conditioned on the content image rather than from the style image alone.
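To make the style-guided idea more concrete, here is a minimal, framework-free sketch of adaptive instance normalization (AdaIN), the operation the FUNIT paper describes for injecting a style code into content features in the decoder. This is not Imaginaire's code: the function name `adain` and the scalar parameters `style_scale` and `style_shift` are hypothetical, and a real model would apply this per channel to tensors, with the scale and shift predicted from the style code by a network.

```python
from statistics import fmean, pstdev


def adain(content_channel, style_scale, style_shift, eps=1e-5):
    """Normalize one content feature channel, then rescale it with style parameters.

    content_channel: list of floats (one flattened feature channel; illustrative only)
    style_scale, style_shift: scalars assumed to come from a style code
    """
    mean = fmean(content_channel)
    std = pstdev(content_channel)
    # Remove the content statistics, then impose the style statistics.
    return [
        style_scale * (x - mean) / (std + eps) + style_shift
        for x in content_channel
    ]


# Toy values standing in for content features and style parameters.
content = [0.0, 2.0, 4.0, 6.0]
stylized = adain(content, style_scale=3.0, style_shift=10.0)
```

The output keeps the relative ordering of the content values while taking its mean and spread from the style parameters. In COCO-FUNIT, the style code that would supply those parameters is additionally conditioned on the content image.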
Supervised Image-to-Image Translation
| Algorithm | Description | Publication |
|---|---|---|
| pix2pixHD | Learn a mapping that converts a semantic image to a high-resolution photorealistic image. | Wang et al., CVPR 2018 |
| SPADE | Improve pix2pixHD on handling diverse input labels and delivering better output quality. | Park et al., CVPR 2019 |

Unsupervised Image-to-Image Translation
| Algorithm | Description | Publication |
|---|---|---|
| UNIT | Learn a one-to-one mapping between two visual domains. | Liu et al., NeurIPS 2017 |
| MUNIT | Learn a many-to-many mapping between two visual domains. | Huang et al., ECCV 2018 |
| FUNIT | Learn a style-guided image translation model that can generate translations in unseen domains. | Liu et al., ICCV 2019 |
| COCO-FUNIT | Improve FUNIT with a content-conditioned style encoding scheme for style code computation. | Saito et al., ECCV 2020 |


Video-to-Video Translation
| Algorithm | Description | Publication |
|---|---|---|
| vid2vid | Learn a mapping that converts a semantic video to a photorealistic video. | Wang et al., NeurIPS 2018 |
| fs-vid2vid | Learn a subject-agnostic mapping that converts a semantic video and an example image to a photorealistic video. | Wang et al., NeurIPS 2019 |
| wc-vid2vid | Improve vid2vid on view consistency and long-term consistency. | Mallya et al., ECCV 2020 |