Obtaining datasets that include thorough labeling of sensitive attributes is difficult, especially in the domain of computer vision. Recently, Google has introduced the More Inclusive Annotations for People (MIAP) dataset in their Open Images Extended collection.
The collection consists of more complete bounding box annotations for the person class hierarchy in 100k images containing people. In addition, every annotation is labeled with fairness-related attributes, namely perceived gender presentation and perceived age range. As the focus of responsible AI research increasingly shifts toward reducing unfair bias, Google aims to encourage researchers already leveraging Open Images to incorporate the MIAP annotations into their work.
Every image in the original Open Images dataset carries image-level annotations that broadly describe the image, along with bounding boxes drawn around specific objects. During annotation, less specific classes were temporarily pruned from the label candidate set to avoid drawing multiple boxes around the same object, a process known as hierarchical de-duplication.
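Hierarchical de-duplication can be sketched as follows. This is an illustrative reconstruction, not Google's actual annotation pipeline: the parent-child mapping mirrors the person subtree of Open Images, and the `deduplicate` helper is a hypothetical name.

```python
# Parent-child relationships in the Open Images person subtree
# (illustrative; the real hierarchy spans many more classes).
HIERARCHY = {
    "Man": "Person",
    "Woman": "Person",
    "Boy": "Person",
    "Girl": "Person",
}

def deduplicate(candidates):
    """Drop a label when a more specific child label is also present,
    so annotators draw only one box per object."""
    parents_of_present = {HIERARCHY[c] for c in candidates if c in HIERARCHY}
    return [c for c in candidates if c not in parents_of_present]

print(deduplicate(["Person", "Woman"]))  # ['Woman']
```

With both "Person" and "Woman" as candidates, only the more specific "Woman" survives, which is exactly why a box for the generic person class could be missing from the original annotations.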
The MIAP dataset covers the five classes in the person hierarchy of the original Open Images dataset: man, woman, boy, girl, and person. These labels make the Open Images dataset valuable for research advancing responsible AI, allowing users to train a general person detector while retaining access to perceived gender and age range labels for bias mitigation and fairness analysis.
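In Open Images, each class is identified by a Knowledge Graph MID rather than a plain name. A minimal sketch of filtering box annotations down to the person hierarchy might look like the following; the MIDs listed are believed to be the Open Images IDs for these five classes, but verify them against the dataset's class descriptions file before relying on them.

```python
# Assumed MIDs for the person-hierarchy classes in Open Images;
# check against class-descriptions-boxable.csv in the dataset.
PERSON_HIERARCHY = {
    "/m/01g317": "Person",
    "/m/04yx4": "Man",
    "/m/03bt1vf": "Woman",
    "/m/01bl7v": "Boy",
    "/m/05r655": "Girl",
}

def is_person_box(label_mid):
    """Return True if a box annotation belongs to the person hierarchy."""
    return label_mid in PERSON_HIERARCHY

print(is_person_box("/m/04yx4"))  # True
```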
However, it was found that the combination of hierarchical de-duplication and societally imposed distinctions between woman/girl and man/boy introduced limitations in the original annotations. As a result, the bounding box annotations in some images were incomplete, with some prominently appearing people left unannotated.
The new MIAP annotations are designed to overcome these limitations. Instead of asking annotators to draw boxes for the most specific class in the hierarchy, Google has inverted the procedure: it always requests bounding boxes for the gender- and age-agnostic person class. Each person box is then separately labeled with perceived gender presentation and perceived age range.
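The resulting annotation format pairs every person box with its perceived-attribute labels, which makes subgroup tallies straightforward. The sketch below builds a few illustrative rows mimicking the released CSV schema; the column names (`GenderPresentation`, `AgePresentation`) and the Person MID are assumptions to double-check against the actual MIAP files.

```python
import pandas as pd

# Illustrative rows mimicking the MIAP box annotation schema
# (column names are assumptions; verify against the released CSVs).
boxes = pd.DataFrame({
    "ImageID": ["a1", "a1", "b2"],
    "LabelName": ["/m/01g317"] * 3,  # assumed Person class MID
    "GenderPresentation": [
        "Predominantly Feminine", "Unknown", "Predominantly Masculine",
    ],
    "AgePresentation": ["Young", "Unknown", "Older"],
})

# Tally boxes per perceived subgroup -- a natural first step when
# checking how balanced the training data is.
by_group = boxes.groupby(["GenderPresentation", "AgePresentation"]).size()
print(by_group)
```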
The number of person bounding boxes has also increased significantly, from ~358k to ~454k. These new annotations provide a more complete ground truth for training a person detector.
Google has included annotations for perceived age range and gender presentation for person bounding boxes. The labels capture perceived age range and gender presentation as judged by third-party annotators from visual cues alone, rather than an individual's self-identified gender or actual age.
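One concrete use of these labels is disaggregated evaluation: measuring a detector's recall separately per perceived subgroup. The sketch below uses synthetic match results and group names modeled on MIAP's perceived-gender-presentation labels; both are assumptions for illustration, not real evaluation data.

```python
from collections import defaultdict

# Synthetic (perceived_group, was_detected) pairs, one per
# ground-truth person box -- illustrative values only.
matches = [
    ("Predominantly Feminine", True),
    ("Predominantly Feminine", False),
    ("Predominantly Masculine", True),
    ("Unknown", True),
]

totals, hits = defaultdict(int), defaultdict(int)
for group, detected in matches:
    totals[group] += 1
    hits[group] += detected  # True counts as 1

# Per-subgroup recall; large gaps between groups flag unfair bias.
for group in totals:
    print(f"{group}: recall = {hits[group] / totals[group]:.2f}")
```

Comparing these per-group numbers is the kind of fairness analysis the MIAP labels are meant to enable.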