MeGA-CDA: Memory Guided Attention for Category-Aware Unsupervised Domain Adaptive Object Detection

¹Johns Hopkins University, ²Mercedes-Benz Research and Development India

Abstract

Existing approaches for unsupervised domain adaptive object detection perform feature alignment via adversarial training. While these methods achieve reasonable improvements in performance, they typically perform category-agnostic domain alignment, thereby resulting in negative transfer of features. To overcome this issue, in this work, we attempt to incorporate category information into the domain adaptation process by proposing Memory Guided Attention for Category-Aware Domain Adaptation (MeGA-CDA). The proposed method consists of employing category-wise discriminators to ensure category-aware feature alignment for learning domain-invariant discriminative features. However, since the category information is not available for the target samples, we propose to generate memory-guided category-specific attention maps which are then used to route the features appropriately to the corresponding category discriminator. The proposed method is evaluated on several benchmark datasets and is shown to outperform existing approaches.

Method

Overview of the proposed framework
Source and target features are aligned through global domain adaptation and category-aware domain adaptation. Category-aware alignment is achieved by employing K category-specific discriminators. Since target labels are unavailable, features are routed to these discriminators using memory-guided category-specific attention maps. Note that the arrows indicate the flow of source and target images during the training process.
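The routing step described above can be illustrated with a small sketch. It assumes that routing means weighting each spatial feature by the attention value of the corresponding category, and it uses a toy logistic domain classifier in place of each category-specific discriminator. The names `route_features` and `discriminator_loss`, the linear classifier, and the label convention (0 = source, 1 = target) are hypothetical choices made for illustration, not the authors' implementation. The adversarial part of training is omitted.

```python
import math

def route_features(feature_map, attention_maps):
    """Weight an H x W x D feature map by each of K attention maps (each H x W).

    Assumption: routing is element-wise weighting of features by attention.
    Returns a list of K weighted feature maps, one per category discriminator.
    """
    routed = []
    for attn in attention_maps:
        routed.append([
            [[attn[i][j] * x for x in feature_map[i][j]]
             for j in range(len(feature_map[0]))]
            for i in range(len(feature_map))
        ])
    return routed

def discriminator_loss(routed_map, weights, bias, domain_label):
    """Binary cross-entropy of a toy per-location logistic domain classifier,
    averaged over all locations. domain_label: 0 = source, 1 = target."""
    losses = []
    for row in routed_map:
        for feat in row:
            logit = sum(w * x for w, x in zip(weights, feat)) + bias
            prob = 1.0 / (1.0 + math.exp(-logit))
            prob = min(max(prob, 1e-7), 1.0 - 1e-7)  # avoid log(0)
            if domain_label == 1:
                losses.append(-math.log(prob))
            else:
                losses.append(-math.log(1.0 - prob))
    return sum(losses) / len(losses)

# Toy example: a 1 x 2 feature grid with D = 2 and K = 2 categories.
feature_map = [[[1.0, 2.0], [3.0, 4.0]]]
attention_maps = [[[1.0, 0.0]], [[0.0, 1.0]]]  # one H x W map per category
routed = route_features(feature_map, attention_maps)
losses = [
    discriminator_loss(r, weights=[0.0, 0.0], bias=0.0, domain_label=1)
    for r in routed
]
```

In this toy setup, each discriminator sees only the locations that its category's attention map highlights, which is the intent of category-aware alignment.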

Memory block

A memory module has two operations: write and read. The write operation uses features extracted from the neural network to update the memory elements appropriately. The read operation uses features extracted from the neural network as queries to retrieve the most similar memory element (or prototypical feature).
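A minimal sketch of the two memory operations is given below. The text does not specify the similarity measure or the update rule, so this sketch assumes cosine similarity for retrieval and a moving-average update of the nearest memory element. The class `Memory`, its methods, and the `momentum` parameter are hypothetical names introduced for illustration.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)

def cosine(a, b):
    """Cosine similarity between two vectors."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))

class Memory:
    """Toy memory holding a fixed number of unit-length prototype vectors."""

    def __init__(self, items):
        self.items = [l2_normalize(item) for item in items]

    def nearest(self, query):
        """Index of the memory element most similar to the query."""
        sims = [cosine(query, item) for item in self.items]
        return max(range(len(sims)), key=sims.__getitem__)

    def read(self, query):
        """Retrieve the most similar memory element for a query feature."""
        return list(self.items[self.nearest(query)])

    def write(self, feature, momentum=0.9):
        """Assumed update rule: blend the feature into its nearest element."""
        idx = self.nearest(feature)
        f = l2_normalize(feature)
        blended = [momentum * m + (1 - momentum) * x
                   for m, x in zip(self.items[idx], f)]
        self.items[idx] = l2_normalize(blended)
        return idx

memory = Memory([[1.0, 0.0], [0.0, 1.0]])
retrieved = memory.read([0.9, 0.1])                    # nearest prototype
updated_slot = memory.write([0.8, 0.6], momentum=0.5)  # index of updated slot
```

Here the read returns the first prototype because it points in nearly the same direction as the query, and the write moves that same prototype toward the new feature while leaving the other element untouched.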

Category-Aware Attention

To determine the attention at a particular location, we use the feature at that location as a query to retrieve relevant items from the different category-specific memory networks. The retrieved items are then compared with the query, and the category-specific attention map is computed from their similarity. Furthermore, to improve the effectiveness of the memory module and the attention map generation process, we propose a metric-learning-based approach that learns an appropriate similarity metric from the weak supervision available in the source domain.
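The query-and-compare procedure above can be sketched as follows. This is an illustrative approximation: it uses plain cosine similarity rather than the learned similarity metric, and it assumes that similarities are turned into attention with a softmax across categories. The function names and the `temperature` parameter are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na > 0 and nb > 0 else 0.0

def best_match_similarity(query, memory_items):
    """Similarity between the query and its most similar memory item."""
    return max(cosine(query, item) for item in memory_items)

def category_attention(feature_map, memories, temperature=0.1):
    """Compute K attention maps from an H x W x D feature map.

    memories: list of K category memories, each a list of D-dim items.
    Assumption: a softmax over categories, so that attention values at
    each location sum to 1 across the K maps.
    """
    k = len(memories)
    height, width = len(feature_map), len(feature_map[0])
    maps = [[[0.0] * width for _ in range(height)] for _ in range(k)]
    for i in range(height):
        for j in range(width):
            sims = [best_match_similarity(feature_map[i][j], mem)
                    for mem in memories]
            peak = max(sims)  # subtract the max for numerical stability
            exps = [math.exp((s - peak) / temperature) for s in sims]
            total = sum(exps)
            for c in range(k):
                maps[c][i][j] = exps[c] / total
    return maps

# Toy example: K = 2 category memories and a 1 x 2 feature grid.
memories = [
    [[1.0, 0.0], [0.9, 0.1]],  # items for category 0
    [[0.0, 1.0], [0.1, 0.9]],  # items for category 1
]
feature_map = [[[1.0, 0.1], [0.1, 1.0]]]
attention = category_attention(feature_map, memories)
```

In this example the first location receives high attention in the category 0 map and the second location in the category 1 map, since each feature closely matches an item in the corresponding memory.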

Results

Qualitative detection results. Global alignment results in missed detections. In contrast, the proposed approach reduces false positives while achieving high-quality detections.

Comparison of attention maps computed using cosine similarity (top row) and learned similarity (bottom row). Although cosine-similarity-based attention provides reasonable focus on category features, learned-similarity-based attention is more accurate.

BibTeX


   @inproceedings{vs2021mega,
     title={Mega-cda: Memory guided attention for category-aware unsupervised domain adaptive object detection},
     author={Vs, Vibashan and Gupta, Vikram and Oza, Poojan and Sindagi, Vishwanath A and Patel, Vishal M},
     booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
     pages={4516--4526},
     year={2021}
   }