An image retrieval system is a computer system used to browse, search for, and retrieve images from a large database of digital images. Feature extraction is the most crucial aspect of image retrieval: the extracted features represent an image and should make effective retrieval possible. Deep Metric Learning (DML) trains a neural network to map input images into a lower-dimensional embedding space in which similar images lie closer together than dissimilar ones. Unfortunately, DML does not resolve background bias, which causes irrelevant features to be extracted.
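The retrieval idea behind DML can be illustrated with a minimal sketch: given precomputed embeddings, retrieval is just a nearest-neighbour ranking in the embedding space. The function name and the toy 2-D embeddings below are illustrative assumptions, not part of the paper.

```python
import numpy as np

def retrieve(query_emb, gallery_embs):
    """Rank gallery images by Euclidean distance to the query embedding."""
    dists = np.linalg.norm(gallery_embs - query_emb, axis=1)
    return np.argsort(dists)  # indices of the closest (most similar) images first

# Toy 2-D embeddings: indices 0 and 1 are "similar" images, index 2 is dissimilar.
gallery = np.array([[0.1, 0.0], [0.2, 0.1], [5.0, 5.0]])
query = np.array([0.0, 0.0])
order = retrieve(query, gallery)  # similar images ranked before the dissimilar one
```

If the network encodes the background rather than the object, two images of the same object on different backgrounds land far apart in this space, which is exactly the failure mode the paper studies.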
In the literature, two main approaches attempt to overcome the problem of background bias: Background Augmentation and Attribution Regularization. Both were designed for classification networks and cannot be directly used for DML networks. Background augmentation techniques replace the background of images used for training or inference with random pictures. Attribution regularization computes the attribution map of an input sample during training in order to determine the regions of a picture that the network concentrates on. A German research team proposes a study analyzing the influence of the background on DML models through the use of three standard datasets and five common loss functions.
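To make the idea of an attribution map concrete, here is a minimal, model-agnostic sketch using occlusion: blank out each patch of the image and measure how much the embedding moves. This is an illustrative stand-in for the (typically gradient-based) attribution methods used in the literature; the toy "network" and all names below are assumptions, not the paper's method.

```python
import numpy as np

def occlusion_attribution(embed_fn, image, patch=4):
    """Crude occlusion-based attribution map: how much does the embedding
    change when each patch of the image is blanked out?"""
    base = embed_fn(image)
    h, w = image.shape
    attr = np.zeros((h // patch, w // patch))
    for i in range(0, h, patch):
        for j in range(0, w, patch):
            occluded = image.copy()
            occluded[i:i + patch, j:j + patch] = 0.0
            attr[i // patch, j // patch] = np.linalg.norm(embed_fn(occluded) - base)
    return attr

# Toy "network" that only looks at the top-left quadrant, so the
# attribution map concentrates there and is zero elsewhere.
embed = lambda img: np.array([img[:4, :4].mean()])
img = np.ones((8, 8))
amap = occlusion_attribution(embed, img, patch=4)
```

A regularizer would then penalize attribution mass falling on background regions; the point here is only what an attribution map measures.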
The study conducted in this article aims at two major points:
1) Show that models trained with DML are not robust against background bias.
2) Propose a data augmentation technique to remedy this problem.
The dependence of trained DML models on the image background is measured using a new test environment introduced by the authors. They postulate that the more a DML model considers the background of images when creating an embedding, the more the embeddings will vary when the image's background is changed. Consequently, if the model relies on the background, retrieval performance should suffer significantly when the backgrounds of test images are randomly replaced. They therefore propose to create a new test dataset by combining the region of interest of each image with a background from the stock photo website Unsplash. A U-Net segmentation model detects the object of interest.
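The compositing step can be sketched as a simple masked blend: keep the object pixels where the segmentation mask is 1 and take the new background elsewhere. In practice the mask would come from the U-Net; here it is a hand-made toy mask, and the function name is an assumption for illustration.

```python
import numpy as np

def replace_background(image, mask, background):
    """Composite the object of interest (mask == 1) onto a new background.
    `image` and `background` are HxWx3 arrays; `mask` is HxW in {0, 1}."""
    mask3 = mask[..., None]  # add a channel axis so the mask broadcasts over RGB
    return mask3 * image + (1 - mask3) * background

img = np.full((4, 4, 3), 0.8)                   # original image ("object" pixels)
bg = np.zeros((4, 4, 3))                        # replacement background (black)
mask = np.zeros((4, 4)); mask[1:3, 1:3] = 1.0   # toy object region
out = replace_background(img, mask, bg)         # object kept, background swapped
```

Applying this with randomly drawn Unsplash backgrounds to every test image yields the modified test set used to probe background dependence.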
To overcome background bias in DML, the authors apply a new strategy, BGAugment, which performs data augmentation during training and validation, inspired by the literature on background bias in classification networks. They follow the same process used to create the test dataset. To avoid interfering with the test set's backgrounds, the Unsplash images used during training differ from those used in the test set.
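Keeping the training and test background pools disjoint is a simple bookkeeping step; a minimal sketch (the function name, file names, and split ratio are assumptions, not taken from the paper):

```python
import random

def split_background_pool(background_files, test_fraction=0.2, seed=0):
    """Split downloaded background images into disjoint train/test pools so
    BGAugment never trains with a background that also appears in the test set."""
    rng = random.Random(seed)          # fixed seed makes the split reproducible
    files = sorted(background_files)
    rng.shuffle(files)
    n_test = int(len(files) * test_fraction)
    return files[n_test:], files[:n_test]  # (train pool, test pool)

train_bgs, test_bgs = split_background_pool([f"bg_{i}.jpg" for i in range(10)])
```

During training, each sample's background is then replaced with a random draw from the train pool only.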
To validate the two postulates mentioned above, an experimental study compared three ranking losses, Contrastive Loss, Triplet Loss, and Multi Similarity Loss, with two classification losses, ArcFace Loss and Normalized Softmax Loss. Experiments were performed on three standard benchmark datasets for Deep Metric Learning: Cars196, CUB200, and Stanford Online Products. The results confirm that when a model is trained without BGAugment, its performance drops on the modified test set. Using the proposed data augmentation, on the other hand, improves these results and makes the model more robust against background bias.
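For reference, two of the compared ranking losses follow standard definitions, sketched below with NumPy for single pairs/triplets (the margins are illustrative defaults; Multi Similarity, ArcFace, and Normalized Softmax losses are omitted for brevity).

```python
import numpy as np

def contrastive_loss(a, b, same, margin=1.0):
    """Pull same-class pairs together; push different-class pairs
    at least `margin` apart."""
    d = np.linalg.norm(a - b)
    return d**2 if same else max(0.0, margin - d)**2

def triplet_loss(anchor, pos, neg, margin=0.2):
    """Require the anchor to be closer to the positive than to the
    negative by at least `margin`."""
    d_pos = np.linalg.norm(anchor - pos)
    d_neg = np.linalg.norm(anchor - neg)
    return max(0.0, d_pos - d_neg + margin)

a = np.array([0.0, 0.0])   # anchor embedding
p = np.array([0.1, 0.0])   # positive (same class)
n = np.array([1.0, 0.0])   # negative (different class)
```

With a well-separated triplet like the one above, the triplet loss is already zero, i.e., the margin constraint is satisfied.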
In this paper, the authors showed that it is advantageous for retrieval settings, such as object retrieval or person re-identification systems, to investigate and counteract background bias in DML. They claim to be the first to demonstrate how background bias affects DML models. Finally, a novel approach is proposed to reduce background bias in DML that requires no additional labeling work, model adjustments, or longer inference times.
This article is a research summary written by Marktechpost Staff based on the research paper 'On Background Bias in Deep Metric Learning'. All credit for this research goes to the researchers on this project. Check out the paper and code.
Mahmoud is a PhD researcher in machine learning. He also holds a bachelor's degree in physical science and a master's degree in telecommunications and networking systems. His current areas of research concern computer vision, stock market prediction and deep learning. He produced several scientific articles about person re-identification and the study of the robustness and stability of deep