Researchers At Fujitsu Use Deep Learning To Develop A Person Re-Identification (Re-ID) Method (ECA-Net) That Incorporates Obtained Pair Images Without The ID Labels Into The Training Data

Over the last few years, video surveillance systems have experienced significant technological and economic expansion. In this context, person re-identification (Re-ID) emerged: the task of matching images of the same person across different cameras. Re-ID is used in several areas, such as security and the study of customer behavior, and it became widely used thanks to the performance achieved by deep neural networks. However, performance drops on domains that differ from the training data, so deployment in real-world scenarios remains difficult.

Unsupervised domain-adaptive Re-ID adapts a model to a new domain using unlabeled data from that domain. This approach first develops a base model using labeled data, including publicly available datasets. The pre-trained model is then used to infer the labels of the target-domain data. The inferred labels, known as pseudo-labels, are used to train the target-domain model. However, these pseudo-labels are inherently noisy, which degrades performance. Recently, a research team from the Japanese corporation Fujitsu created an environment-constrained adaptive network (ECA-Net) to reduce the pseudo-label noise for the target domain.
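
To make the pseudo-labeling step concrete, here is a minimal sketch in Python. It assumes that feature vectors have already been extracted by the pre-trained model (the toy vectors below are made up), and it groups them with a simple distance-threshold clustering. The function names, the threshold value, and the clustering rule are illustrative assumptions, not the procedure from the paper.

```python
from typing import List, Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def pseudo_labels(features: Sequence[Sequence[float]], threshold: float) -> List[int]:
    """Assign a cluster id to every feature vector.

    Two images end up in the same cluster if they are connected by a chain
    of pairs whose distance is below `threshold` (hypothetical rule, used
    here only as a stand-in for a real clustering algorithm).
    """
    n = len(features)
    parent = list(range(n))  # union-find forest, one node per image

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if euclidean(features[i], features[j]) < threshold:
                parent[find(j)] = find(i)

    # Map each union-find root to a compact label 0, 1, 2, ...
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]


# Toy target-domain features (made up for illustration).
toy_features = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.0, 5.1], [0.0, 0.1]]
print(pseudo_labels(toy_features, threshold=0.5))  # prints [0, 0, 1, 1, 0]
```

Any image that is wrongly linked to a cluster receives a wrong pseudo-label, which is the kind of noise ECA-Net aims to reduce.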

The proposed method exploits information from the multi-camera environment. The fields of view of different cameras often overlap, and the approach relies on acquiring paired images of the same identity from these overlapping areas in the target domain. ECA-Net follows the mean-teacher technique, in which the teacher model's weights are a temporal moving average of the student network's weights.

ECA-Net uses two datasets. The first is the source domain, which is used for an initial supervised training phase. Starting from this pre-trained model, the CNN parameters are then adapted to the second dataset, the target domain, through self-training. Pseudo-labels are obtained by clustering the distance matrix computed over the person images. The authors compute this distance matrix with k-reciprocal re-ranking, which uses the Jaccard distance based on the sets of neighboring images. The list of same-person pairs in the target domain is then used to refine these pseudo-labels. Finally, the target-domain model is trained on the training data with the refined pseudo-labels.
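
The mean-teacher update and the Jaccard distance over k-reciprocal neighbors mentioned above can be sketched as follows. This is a simplified illustration: weights are plain dictionaries of floats rather than network tensors, the smoothing factor `alpha` is an assumed value, and the Jaccard computation omits the additional steps of full k-reciprocal re-ranking. All function names are hypothetical.

```python
from typing import Dict, List, Set


def ema_update(teacher: Dict[str, float], student: Dict[str, float],
               alpha: float = 0.999) -> Dict[str, float]:
    """Mean-teacher step: teacher <- alpha * teacher + (1 - alpha) * student."""
    return {name: alpha * teacher[name] + (1.0 - alpha) * student[name]
            for name in teacher}


def k_nearest(dist: List[List[float]], i: int, k: int) -> Set[int]:
    """Indices of the k images closest to image i (excluding i itself)."""
    others = sorted((j for j in range(len(dist)) if j != i),
                    key=lambda j: dist[i][j])
    return set(others[:k])


def k_reciprocal(dist: List[List[float]], i: int, k: int) -> Set[int]:
    """Images j such that j is a k-nearest neighbor of i and i is one of j.

    Image i is included in its own set so the set is never empty.
    """
    return {j for j in k_nearest(dist, i, k) if i in k_nearest(dist, j, k)} | {i}


def jaccard_distance(dist: List[List[float]], i: int, j: int, k: int) -> float:
    """1 - |R(i) & R(j)| / |R(i) | R(j)| over k-reciprocal neighbor sets R."""
    a, b = k_reciprocal(dist, i, k), k_reciprocal(dist, j, k)
    return 1.0 - len(a & b) / len(a | b)


# Toy example: five images placed on a line, forming two tight groups.
points = [0, 1, 2, 50, 51]
toy_dist = [[abs(p - q) for q in points] for p in points]
print(jaccard_distance(toy_dist, 0, 1, k=2))  # same group: prints 0.0
print(jaccard_distance(toy_dist, 0, 3, k=2))  # different groups: prints 1.0
```

The Jaccard distance is small when two images share most of their mutual neighbors, which tends to be more robust than comparing the two feature vectors directly.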

Since it is not obvious which CNN feature is most appropriate for each person pair, the authors developed a graph-based method that efficiently selects the optimal CNN features.

Three public datasets were used in the experiments to evaluate the proposed network: Market-1501, DukeMTMC-reID, and MSMT17. In addition, because the public datasets are not suited to assessing performance with the overlap information, the authors also evaluated the method on a private dataset (Shopping mall). Two metrics were used: mean average precision (mAP) and cumulative matching characteristic (CMC). According to the findings, the performance roughly correlates with the ratio of training data to pair images.
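
For readers unfamiliar with the two metrics, the sketch below shows how they are commonly computed from ranked retrieval results. It assumes that each query has already been turned into a ranked list of booleans (True where the gallery image shares the query's identity), with any camera-based filtering applied beforehand. The function names and the toy rankings are illustrative assumptions and are not taken from the paper.

```python
from typing import List, Sequence


def average_precision(ranked_matches: Sequence[bool]) -> float:
    """Average of the precision values measured at each correct match."""
    hits, total = 0, 0.0
    for rank, is_match in enumerate(ranked_matches, start=1):
        if is_match:
            hits += 1
            total += hits / rank  # precision at this rank
    return total / hits if hits else 0.0


def mean_average_precision(all_ranked: Sequence[Sequence[bool]]) -> float:
    """mAP: mean of the per-query average precision."""
    return sum(average_precision(r) for r in all_ranked) / len(all_ranked)


def cmc(all_ranked: Sequence[Sequence[bool]], max_rank: int) -> List[float]:
    """CMC curve: fraction of queries with a correct match in the top k,
    for k = 1 .. max_rank."""
    curve = []
    for k in range(1, max_rank + 1):
        found = sum(1 for r in all_ranked if any(r[:k]))
        curve.append(found / len(all_ranked))
    return curve


# Two toy queries, each with a gallery of three images (made-up results).
toy_rankings = [
    [True, False, True],   # correct matches at ranks 1 and 3
    [False, True, False],  # correct match at rank 2
]
print(mean_average_precision(toy_rankings))
print(cmc(toy_rankings, max_rank=3))  # prints [0.5, 1.0, 1.0]
```

mAP rewards placing all images of the correct identity near the top of the ranking, while the CMC value at rank k only checks whether at least one correct match appears among the first k results.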

The paper presents a new person re-identification method (ECA-Net) for reducing the domain gap. According to the authors, this is the first network for domain-adaptive learning with multi-camera constraints. The proposed method adds the obtained pair images to the training data without ID labels. The experiments show that ECA-Net performs better than the most recent state-of-the-art techniques.

This Article is written as a research summary article by Marktechpost Staff based on the research paper 'UNSUPERVISED DOMAIN-ADAPTIVE PERSON RE-IDENTIFICATION WITH MULTI-CAMERA CONSTRAINTS'. All Credit For This Research Goes To Researchers on This Project. Check out the paper.

Mahmoud is a PhD researcher in machine learning. He also holds a bachelor's degree in physical science and a master's degree in telecommunications and networking systems. His current areas of research concern computer vision, stock market prediction and deep learning. He produced several scientific articles about person re-identification and the study of the robustness and stability of deep networks.