Researchers Propose ‘FaceMAE’, a Novel Framework Where Face Privacy and Recognition Performance Are Considered Simultaneously

This article is written as a summary by Marktechpost Staff based on the research paper 'FaceMAE: Privacy-Preserving Face Recognition
via Masked Autoencoders'. All credit for this research goes to the researchers of this project. Check out the paper and GitHub.


Face recognition has made significant and consistent progress in accuracy, and it is now widely employed in everyday activities such as online payment and identity verification. Two major drivers of this improvement are more advanced face recognition algorithms and large-scale public face databases.

In recent years, concerns have grown that collecting and distributing large-scale face datasets can leak private information about training samples, such as their membership in the dataset and their facial features. Building large-scale privacy-preserving face datasets for downstream tasks is therefore an important and difficult problem for the face recognition community.

Face photographs have typically been subjected to distortions such as blurring, noising, and masking in order to increase privacy. These basic distortion approaches provide only limited privacy protection while degrading the semantics of a face image, resulting in unsatisfactory recognition results. Recent research instead attempts to synthesize identity-ambiguous faces, de-identified faces, or attribute-removed faces with modified generative adversarial networks in order to create a privacy-preserving face dataset.
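The classical distortions mentioned above are simple image-level operations. As a minimal numpy sketch (function names and parameters are illustrative, not from the paper), blurring and noising can be written as:

```python
import numpy as np

rng = np.random.default_rng(0)

def box_blur(img, k=5):
    """Naive box blur: replace each pixel by the mean of its k x k window."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.empty_like(img, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = padded[i:i + k, j:j + k].mean()
    return out

def gaussian_noise(img, sigma=25.0):
    """Additive Gaussian pixel noise."""
    return img.astype(float) + rng.normal(0.0, sigma, img.shape)
```

Both operations destroy facial detail indiscriminately, which is exactly why a recognition model trained on such images performs poorly.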

Although these GAN-based approaches can generate realistic-looking faces, their utility for training deep models is not assured. The domain gap between generated and original faces is significant, resulting in poor face recognition performance for GAN-based systems. Another disadvantage is that a generator trained on a single dataset cannot generalize to unseen identities. Furthermore, all raw face photos from the target dataset must be used to train the generative models, posing additional privacy risks.

In a recent study, InsightFace researchers proposed FaceMAE, a novel framework that considers face privacy and recognition performance at the same time. FaceMAE consists of two stages, training and deployment. During the training stage, the researchers used masked autoencoders (MAE) to reconstruct a new dataset from a randomly masked face dataset. The goal of FaceMAE is to generate faces that remain useful for face recognition training.
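The masking step in MAE-style training divides each image into non-overlapping patches and discards a random subset of them. A minimal numpy sketch of this step, using the 75% mask ratio reported in the article (the function name and patch size are illustrative assumptions, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

def random_mask_patches(img, patch=16, mask_ratio=0.75):
    """Split a square grayscale image into non-overlapping patches and
    zero out a random mask_ratio fraction of them, MAE-style.
    Returns the masked image and a boolean keep-mask over patches."""
    h, w = img.shape[:2]
    ph, pw = h // patch, w // patch
    n = ph * pw
    n_keep = int(n * (1 - mask_ratio))
    keep = np.zeros(n, dtype=bool)
    keep[rng.choice(n, n_keep, replace=False)] = True
    out = img.astype(float).copy()
    for idx in range(n):
        if not keep[idx]:
            i, j = divmod(idx, pw)
            out[i * patch:(i + 1) * patch, j * patch:(j + 1) * patch] = 0.0
    return out, keep
```

Only the surviving 25% of patches carry information into the autoencoder, which is what limits how much of the original face can leak; the decoder then reconstructs a full face from those visible patches.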

Source: https://arxiv.org/pdf/2205.11090v1.pdf

The researchers used an instance relation matching (IRM) module to reduce the disparity between the relation graphs of the original and reconstructed faces. Rather than inheriting the MSE loss from MAE, the team was the first to design this separate optimization objective. After the training stage, the team used the trained FaceMAE to produce a reconstructed dataset and trained the face recognition backbone on it.
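The idea of matching relation graphs can be sketched as comparing the pairwise-similarity matrices of original and reconstructed embeddings. The following is a minimal numpy sketch under the assumption that the relation graph is a cosine-similarity matrix and the discrepancy is a mean squared difference; the paper's exact formulation may differ:

```python
import numpy as np

def relation_graph(feats):
    """Pairwise cosine-similarity matrix ('relation graph') of embeddings.
    feats: (n_samples, dim) array of feature vectors."""
    normed = feats / np.linalg.norm(feats, axis=1, keepdims=True)
    return normed @ normed.T

def irm_loss(orig_feats, recon_feats):
    """Mean squared discrepancy between the two relation graphs,
    an illustrative stand-in for the IRM objective."""
    g_orig = relation_graph(orig_feats)
    g_recon = relation_graph(recon_feats)
    return float(np.mean((g_orig - g_recon) ** 2))
```

Matching relations between instances, rather than matching pixels as the MSE loss does, lets the reconstructed faces differ visually from the originals while preserving the inter-sample structure that recognition training depends on.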

According to the experiments, FaceMAE outperforms state-of-the-art synthetic face dataset generation approaches by a wide margin in recognition accuracy on numerous large-scale face datasets. When trained on images reconstructed from 75%-masked faces of CASIA-WebFace, FaceMAE reduced the error rate of the runner-up approach by at least 50%. This indicates that the method produces more useful privacy-preserving face images.

Conclusion

In a recent study, InsightFace researchers tackled the pressing and difficult problem of privacy-preserving face recognition. FaceMAE was proposed as a method for creating synthetic samples that minimize privacy leakage while maintaining recognition performance. The instance relation matching module was designed so that the generated samples can efficiently train deep models. The experiments showed that the proposed method outperforms the runner-up by lowering recognition error by at least 50% on a popular face dataset, while reducing the risk of privacy leakage by about 20%.