Deep multimodal representation learning for generalizable person re-identification

Abstract

Person re-identification (Re-ID) plays a significant role in realistic scenarios because of its applications in public security and video surveillance. Recently, supervised and semi-unsupervised learning paradigms, which benefit from large-scale datasets and strong computing performance, have achieved competitive performance on specific target domains. However, when Re-ID models are deployed directly in a new domain without target samples, they suffer from considerable performance degradation and poor domain generalization. To address this challenge, we propose a Deep Multimodal Representation Learning network that elaborates rich semantic knowledge to assist representation learning during pre-training. Importantly, a multimodal representation learning strategy is introduced to translate the features of different modalities into a common space, which can significantly

Publication
Machine Learning
向孙程
PhD student (enrolled 2017)
冉苇
PhD student (enrolled 2019)
于泽芳
PhD student (enrolled 2018)
刘婷
Lecturer
付宇卓
Professor, PhD supervisor