Unsupervised Domain Adaptation (UDA) is an effective approach to tackle the issue of domain shift. Specifically, UDA methods try to align the source and target representations to improve generalization on the target domain. Further, UDA methods work under the assumption that the source data is accessible during the adaptation process. However, in real-world scenarios, access to the labelled source data is often restricted due to privacy regulations, data transmission constraints, or proprietary data concerns. The Source-Free Domain Adaptation (SFDA) setting aims to alleviate these concerns by adapting a source-trained model to the target domain without requiring access to the source data. In this paper, we explore the SFDA setting for the task of adaptive object detection. To this end, we propose a novel training strategy for adapting a source-trained object detector to the target domain without source data. More precisely, we design a novel contrastive loss to enhance the target representations by exploiting the object relations for a given target domain input. These object instance relations are modelled using an Instance Relation Graph (IRG) network and are then used to guide the contrastive representation learning. In addition, we utilize a student-teacher based knowledge distillation strategy to avoid overfitting to the noisy pseudo-labels generated by the source-trained model. Extensive experiments on multiple object detection benchmark datasets show that the proposed approach is able to efficiently adapt source-trained object detectors to the target domain, outperforming previous state-of-the-art domain adaptive detection methods.
Left: Supervised training of the detection model on the source domain. Right: Source-Free Domain Adaptation (SFDA) setup, i.e., the source-trained model is adapted to the target domain in the absence of source data with pseudo-label self-training and the proposed Instance Relation Graph (IRG) network guided contrastive loss.
(a) Class-agnostic object proposals generated by the Region Proposal Network (RPN). (b) Cropping out RPN proposals provides multiple contrastive views of an object instance. We utilize this to improve target domain feature representations through RPN-view contrastive learning. However, as RPN proposals are class agnostic, it is challenging to form positive (same class) and negative (different class) pairs, which are essential for contrastive representation learning (CRL).
We follow a student-teacher framework for the detector model training. The proposed Instance Relation Graph (IRG) network models the relations between the object proposals generated by the detector. Using the inter-proposal relations learned by the IRG, we generate pairwise labels to identify positive/negative pairs for contrastive learning. The IRG network is regularized with a distillation loss between the student and teacher models.
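To make the step from inter-proposal relations to pairwise labels concrete, here is a minimal sketch in plain Python. It is not the IRG network itself: it assumes the relation between two proposals can be stood in for by the cosine similarity of their feature vectors, and that a fixed cut-off decides which pairs count as positive. The function names and the `threshold` value are hypothetical and chosen only for illustration.

```python
import math


def relation_matrix(features):
    """Cosine similarity between every pair of proposal feature vectors.

    Assumption: cosine similarity stands in for the relations that the
    IRG network would learn; the real relations come from a graph network.
    """
    def norm(v):
        # Guard against a zero vector to avoid division by zero.
        return math.sqrt(sum(x * x for x in v)) or 1.0

    norms = [norm(v) for v in features]
    return [
        [sum(a * b for a, b in zip(u, v)) / (nu * nv)
         for v, nv in zip(features, norms)]
        for u, nu in zip(features, norms)
    ]


def pairwise_labels(relations, threshold=0.5):
    """Turn a relation matrix into 0/1 pairwise labels.

    A pair (i, j) is labelled positive when its relation score reaches the
    (hypothetical) threshold. The diagonal is masked out, since a proposal
    paired with itself carries no contrastive signal.
    """
    n = len(relations)
    return [
        [1 if i != j and relations[i][j] >= threshold else 0 for j in range(n)]
        for i in range(n)
    ]
```

For example, with three toy features `[[1, 0], [2, 0], [0, 3]]`, the first two proposals point in the same direction and are labelled as a positive pair, while the third is a negative for both.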
Relation matrix analysis for 25 proposal RoI features before and after passing through the IRG network, and the corresponding masked instance pairwise labels. We can observe that the IRG network models the relationships between the proposals, maximizing the similarity between similar proposals and minimizing it between dissimilar ones.
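Once pairwise labels are available, they can drive a contrastive objective over the proposal features. The sketch below uses a generic supervised-contrastive-style loss in plain Python as a stand-in; the exact form of the proposed loss is not given in this text, so the formulation, the function name, and the `temperature` value are all assumptions made for illustration.

```python
import math


def pairwise_contrastive_loss(features, pair_labels, temperature=0.1):
    """Supervised-contrastive-style loss guided by pairwise labels.

    features: list of equal-length vectors, one per proposal.
    pair_labels: square 0/1 matrix; pair_labels[i][j] == 1 marks (i, j)
        as a positive pair, 0 as a negative pair.
    Assumption: a generic formulation, not necessarily the paper's loss.
    """
    def normalize(v):
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        return [x / norm for x in v]

    z = [normalize(v) for v in features]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and pair_labels[i][j]]
        if not positives:
            continue  # an anchor without positives contributes nothing
        # Temperature-scaled cosine similarity to every other proposal.
        logits = {
            j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
            for j in range(n) if j != i
        }
        # Numerically stable log-sum-exp over all non-anchor proposals.
        m = max(logits.values())
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits.values()))
        # Average negative log-probability of the positives for this anchor.
        total += -sum(logits[j] - log_denom for j in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0
```

The loss is low when proposals labelled as positives already lie close together in feature space and high when the labels pair up dissimilar proposals, which is why the quality of the pairwise labels matters for this kind of objective.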
@inproceedings{vs2023instance,
title={Instance relation graph guided source-free domain adaptive object detection},
author={VS, Vibashan and Oza, Poojan and Patel, Vishal M},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
pages={3520--3530},
year={2023}
}