Automatic face recognition is being widely adopted by private and governmental organizations worldwide for various legitimate and beneficial purposes, such as improving security. However, its growing influence also raises concerns about the harm that unfair methods can cause to society, such as discrimination against ethnic minorities. An essential condition for the legitimate deployment of face recognition algorithms is equal accuracy for all demographic groups.
Researchers from the Human Pose Recovery and Behavior Analysis Group at the Computer Vision Center (CVC) and the University of Barcelona (UB) organized a challenge in 2020 within the European Conference on Computer Vision (ECCV). The study evaluates the accuracy of automatic face recognition algorithms, and their bias with respect to gender and skin color, on real-world data.
Sergio Escalera, who led the research team, states that around 151 people participated in the challenge and more than 1,800 submissions were received. The participants used a non-balanced image dataset, which resembles the real-world scenario in which AI-based models should be trained and judged on imbalanced data. In all, the participants worked with 152,917 images from 6,139 identities.
The images were annotated for:
- Protected attributes: gender and skin color
- Legitimate attributes: age group (0-34, 35-64, 65+), head pose, image source (still image, video frame), wearing glasses, and bounding box size.
Julio C. S. Jacques Jr., a researcher at the CVC and the Open University of Catalonia, states that the outcomes were quite positive: the top winning algorithms exceeded 99.9% accuracy while achieving very low scores on the proposed bias metrics. He says this can be considered a step toward developing fairer face recognition methods. However, even though the top solutions exceed 99.9% accuracy, the team detected some groups with higher false-positive or false-negative rates.
On analyzing the top teams’ solutions, the researchers observed higher false-positive rates for females with dark skin tones and for samples in which both individuals wear glasses. In contrast, they noticed higher false-negative rates for males with light skin tones and for samples of individuals under 35. They also noted that individuals under 35 wear glasses less often than older individuals, so the effects of these two attributes combine. This is unsurprising, as the dataset was not balanced across the different demographic attributes.
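To illustrate the kind of per-group analysis described above, here is a minimal sketch of how false-positive and false-negative rates can be broken out by demographic group for a face verification task. The function name, group labels, and data format are illustrative assumptions, not the challenge's actual evaluation code:

```python
from collections import defaultdict

def group_error_rates(pairs):
    """Per-group false-positive and false-negative rates.

    `pairs` is a list of (group, is_same_identity, predicted_same)
    tuples, e.g. ("female_dark_skin", True, False). Illustrative only.
    """
    fp = defaultdict(int)   # impostor pairs wrongly accepted
    fn = defaultdict(int)   # genuine pairs wrongly rejected
    neg = defaultdict(int)  # total impostor pairs per group
    pos = defaultdict(int)  # total genuine pairs per group
    for group, same, pred in pairs:
        if same:
            pos[group] += 1
            if not pred:
                fn[group] += 1
        else:
            neg[group] += 1
            if pred:
                fp[group] += 1
    return {g: {"FPR": fp[g] / neg[g] if neg[g] else 0.0,
                "FNR": fn[g] / pos[g] if pos[g] else 0.0}
            for g in set(pos) | set(neg)}

# Toy data: hypothetical groups "A" and "B"
pairs = [
    ("A", True, True), ("A", True, False), ("A", False, False),
    ("B", False, True), ("B", False, False), ("B", True, True),
]
rates = group_error_rates(pairs)
print(rates["A"]["FNR"])  # 0.5: one of two genuine A pairs rejected
print(rates["B"]["FPR"])  # 0.5: one of two impostor B pairs accepted
```

A simple bias metric can then be defined as the gap between the highest and lowest per-group rate: a model with uniformly high accuracy can still show large gaps of this kind, which is exactly the pattern the organizers report.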
Julio C. S. Jacques Jr. stated that overall accuracy alone is not adequate for building fair face recognition methods, and that future work on the topic must therefore address accuracy and bias mitigation together.