A New Artificial Intelligence (AI) Framework Called DeepPrivacy2 Provides Realistic Anonymization of Human Faces and Whole Bodies

Many applications require collecting personally identifiable information, making image collection and storage commonplace. Recently enacted legislation in many jurisdictions makes it difficult to acquire such data without anonymization or individual authorization. 

Blurring is a common traditional method of image anonymization, but it badly distorts the data and renders it useless for many other purposes. Generative models can now synthesize realistic faces that fit a given scene, which has led to the introduction of realistic anonymization. However, present approaches anonymize only the face, so primary and secondary identifiers elsewhere on the body remain recognizable.

Surface Guided GANs (SG-GAN) offer full-body anonymization using dense pixel-to-surface correspondences derived from Continuous Surface Embeddings (CSE). However, this approach is prone to visual artifacts that degrade image quality. The researchers attribute the poor visual quality to the training dataset, a modification of COCO comprising only 40K human figures. The CSE segmentation used for anonymization also does not cover hair or other body accessories, so the anonymized person frequently “wears” the original ones. Furthermore, SG-GAN fails to anonymize many people since the CSE detector typically misses people who are off-camera.

A new study from the Norwegian University of Science and Technology builds on Surface Guided GANs to address the low visual quality and the incomplete anonymization caused by inadequate segmentation. The researchers introduce the Flickr Diverse Humans (FDH) dataset, a subset of the YFCC100M dataset containing 1.5M photos of people in various settings, and show that the larger dataset directly improves the visual quality of the generated human figures. They also propose a new anonymization framework that combines detections across modalities to improve the detection and segmentation of human figures.

The framework uses separate anonymizers for:

  1. Human figures detected by dense pose estimation
  2. Human figures that CSE does not detect
  3. All other faces 
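
The three-way split above can be pictured as a routing step that runs before any image is generated. The following is a minimal sketch of that idea, not the DeepPrivacy2 implementation: the `Box` type, the `route_detections` function, the category labels, and both thresholds are hypothetical names and values chosen for illustration. It assumes that three detectors (a CSE-based body detector, a generic person detector, and a face detector) have already returned bounding boxes.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box in pixel coordinates (hypothetical type)."""
    x0: float
    y0: float
    x1: float
    y1: float

    def area(self) -> float:
        return max(0.0, self.x1 - self.x0) * max(0.0, self.y1 - self.y0)


def intersection(a: Box, b: Box) -> float:
    """Area of the overlap between two boxes (0.0 if they are disjoint)."""
    return Box(max(a.x0, b.x0), max(a.y0, b.y0),
               min(a.x1, b.x1), min(a.y1, b.y1)).area()


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes."""
    inter = intersection(a, b)
    union = a.area() + b.area() - inter
    return inter / union if union > 0 else 0.0


def route_detections(cse_bodies: List[Box], other_bodies: List[Box],
                     faces: List[Box], iou_thr: float = 0.5,
                     cover_thr: float = 0.5) -> Dict[str, List[Box]]:
    """Assign every detection to one of three anonymizers.

    The thresholds are assumptions for this sketch, not values from the paper.
    """
    # 1. Bodies with a CSE detection go to the surface-guided anonymizer.
    routed = {"cse_body": list(cse_bodies)}
    # 2. Bodies that no CSE detection matches go to the body anonymizer
    #    that works without CSE.
    routed["unconditional_body"] = [
        b for b in other_bodies
        if all(iou(b, c) < iou_thr for c in cse_bodies)
    ]
    # 3. Faces not already covered by any routed body go to the face anonymizer.
    bodies = routed["cse_body"] + routed["unconditional_body"]
    routed["face"] = [
        f for f in faces
        if all(intersection(f, b) < cover_thr * f.area() for b in bodies)
    ]
    return routed
```

Ordering the stages this way means each person is handled by the most informative model available, and the face anonymizer acts as a fallback for anyone whose body was not detected at all.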

The proposed approach uses a simple inpainting GAN for each category, trained with standard GAN techniques. The study’s results show that this GAN can produce high-quality, diverse identities with few task-specific modeling changes. The researchers applied their GAN to face anonymization using an updated version of the Flickr Diverse Faces (FDF) dataset. Because the GAN does not rely on pose guidance, it can anonymize people even when pose information is hard to detect, a significant improvement over earlier face anonymization methods.
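
Inpainting means that the generator fills in a region that has been removed from the image, while everything outside that region is kept. The sketch below illustrates this compositing step on a tiny grayscale image stored as nested lists. It is an illustration under assumptions, not the authors' code: `anonymize_region` and `mean_fill_generator` are hypothetical names, and the mean-fill function is only a stand-in for a trained GAN generator.

```python
from typing import Callable, List

Image = List[List[float]]  # grayscale image as rows of pixel values
Mask = List[List[int]]     # 1 = pixel to anonymize, 0 = pixel to keep


def mean_fill_generator(masked_input: Image, mask: Mask) -> Image:
    """Stand-in for a trained generator: paints the mean of the visible pixels."""
    visible = [p for row, mrow in zip(masked_input, mask)
               for p, m in zip(row, mrow) if not m]
    fill = sum(visible) / len(visible) if visible else 0.0
    return [[fill for _ in row] for row in masked_input]


def anonymize_region(image: Image, mask: Mask,
                     generator: Callable[[Image, Mask], Image]) -> Image:
    """Replace the masked pixels with generated content and keep the rest."""
    # Zero out the masked region so the generator never sees the original identity.
    masked_input = [[0.0 if m else p for p, m in zip(row, mrow)]
                    for row, mrow in zip(image, mask)]
    generated = generator(masked_input, mask)
    # Composite: generated pixels inside the mask, original pixels outside it.
    return [[g if m else p for p, g, m in zip(row, grow, mrow)]
            for row, grow, mrow in zip(image, generated, mask)]
```

Because the original pixels under the mask are never passed to the generator, the output cannot leak them, which is the property that makes inpainting attractive for anonymization.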

The team also demonstrates that techniques developed for unconditional GANs can be applied to the style-based generator to find globally semantically meaningful directions in the GAN latent space. As a result, the proposed anonymization pipeline supports attribute edits guided by text.
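
Once such a direction is known, an attribute edit amounts to moving a latent code along it before the generator is run. The sketch below shows only that final step and assumes the direction has already been found. The `edit_latent` function and its parameters are hypothetical names for this example, and how DeepPrivacy2 derives a direction from a text prompt is not covered here.

```python
import math
from typing import List


def edit_latent(w: List[float], direction: List[float],
                strength: float) -> List[float]:
    """Shift latent code `w` along a unit-normalized semantic direction."""
    if len(w) != len(direction):
        raise ValueError("latent code and direction must have the same length")
    norm = math.sqrt(sum(d * d for d in direction))
    if norm == 0:
        raise ValueError("direction must be non-zero")
    # Normalizing makes `strength` the distance moved in latent space.
    return [wi + strength * di / norm for wi, di in zip(w, direction)]
```

A direction is called global when the same vector produces the same kind of change for every latent code, so a single direction can be reused across all anonymized identities.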

DeepPrivacy2 outperforms prior state-of-the-art realistic anonymization approaches in both image quality and anonymization guarantees. The quality of its synthesis has been evaluated both qualitatively and quantitatively. Since there is no accepted benchmark for anonymization methods, the team compares its results with the widely used face anonymization method DeepPrivacy and, for whole-body anonymization, with Surface Guided GANs (SG-GAN). The whole-body anonymization generator is trained on the FDH dataset, while the face anonymization generator is trained on FDF256, an updated version of FDF. The evaluation also includes data from Market1501, Cityscapes, and COCO.

The results show that DeepPrivacy2 produces high-quality figures across a wide range of scenes, poses, and overlaps. The unconditional full-body generator, which does not use CSE, produces somewhat unnatural arms and legs, which indicates that CSE guidance is also necessary for high-quality anonymization.

The team hopes that their open-source framework will serve as a valuable resource for organizations and individuals in need of anonymization while maintaining image quality, particularly those working in the field of computer vision.


Tanushree Shenwai is a consulting intern at MarktechPost. She is currently pursuing her B.Tech at the Indian Institute of Technology (IIT), Bhubaneswar. She is a data science enthusiast with a keen interest in the applications of artificial intelligence in various fields, and she is passionate about exploring new advancements in technology and their real-life applications.
