
In the realm of Unsupervised Domain Adaptation (UDA), adversarial learning has achieved significant progress. Existing adversarial UDA methods typically employ additional discriminators and feature extractors to engage in a max-min game. However, these methods often fail to effectively utilize the predicted discriminative information, resulting in mode collapse of the generator. In this paper, we propose a Dynamic Balance-based Domain Adaptation (DBDA) method for self-correlated domain adaptive image classification. Instead of adding extra discriminators, we repurpose the classifier as a discriminator and introduce a dynamic balancing learning approach. This approach ensures explicit domain alignment and category distinction, enabling DBDA to fully leverage the predicted discriminative information for effective feature alignment. We conducted experiments on multiple datasets, demonstrating that the proposed method maintains robust classification performance across various scenarios.
Citation: Hui Jiang, Di Wu, Xing Wei, Wenhao Jiang, Xiongbo Qing. Discriminator-free adversarial domain adaptation with information balance[J]. Electronic Research Archive, 2025, 33(1): 210-230. doi: 10.3934/era.2025011
Abbreviations: UDA: Unsupervised Domain Adaptation; MMD: Maximum Mean Discrepancy; ND: Normalized Distance; DBDA: Dynamic Balance-based Domain Adaptation; GANs: Generative Adversarial Networks; MCD: Maximum Classifier Discrepancy; CGDM: Cross-Gradient Difference Minimization; GRL: Gradient Reversal Layer; LDA: Linear Discriminant Analysis.
In recent years, a multitude of methods for computer vision tasks have been explored based on deep neural networks. Nevertheless, traditional deep learning-based image classification techniques necessitate a vast amount of labeled data for training. Acquiring these data and annotations is both time-consuming and costly, making it impractical in various scenarios [1,2]. Additionally, the effectiveness of most classification models relies on the assumption that the distributions of the training data (source domain) and test data (target domain) are identical or similar, leading to a significant drop in performance when the model is tested on data from a shifted distribution.
To address this challenge, research teams have delved into Unsupervised Domain Adaptation (UDA) [3,4]. UDA aims to extract domain-invariant features from both the source and target domains, transferring knowledge from the labeled source domain to an unlabeled target domain that undergoes a domain shift, and thereby training a model with robust generalization capabilities [2,5,6].
Early work established that UDA methods should explore representations that encode domain-invariant knowledge [7,8]. In recent years, several approaches have emerged based on this idea, with widely recognized directions including domain discrepancy measurement-based methods [5,9], contrastive learning-based methods [10,11], and adversarial learning-based methods [12,13], all of which have achieved notable success. Domain discrepancy measurement-based methods aim to minimize discrepancy metrics such as the Maximum Mean Discrepancy (MMD). Contrastive learning-based methods align feature representations by comparing the representation vectors of samples from the source and target domains, using a contrastive loss to pull positive sample pairs closer and push negative sample pairs apart. Adversarial learning-based methods typically use adversarial training between generators and discriminators to implicitly derive domain-invariant features [13,14,15,16,17].
Adversarial UDA methods leverage an automatic adversarial mechanism between a generator and a discriminator to effectively minimize inter-domain discrepancies. These approaches not only reduce the interference of manually designed factors, but also adapt flexibly to various neural network architectures and tasks [12,18,19]. Building on these advantages, we conducted a systematic study of existing adversarial UDA methods to explore their performance in real-world tasks and to identify potential areas for improvement.
In our study, we identified two primary design paradigms in task-oriented adversarial training. The first involves establishing two independent discriminators, where each discriminator adversarially interacts with the generator on the same image sample, thereby focusing on category-level and domain-level alignment, respectively. By comparing the outputs of the two discriminators, this approach can partially mitigate the class-level domain discrepancies during the transfer process. However, it is susceptible to blurred predictions, which complicate the generator's optimization and may lead to local optima during training.
Alternatively, using a single discriminator to classify the domain of each sample primarily emphasizes domain-level feature alignment. While this method performs well in capturing overall domain characteristics, it is less sensitive to class-level information, which may degrade category differentiation. Over prolonged training, the generator is prone to issues such as mode collapse, ultimately resulting in suboptimal transfer performance [8,20].
Moreover, we observed that existing methods often overlook the issue of sample imbalance between the source and target domains. During training, the domain with a larger number of samples tends to dominate, introducing a bias into the model. This bias can weaken the ability to achieve domain alignment and may even lead to negative transfer, significantly affecting the model's performance and generalization capabilities.
To address the issues present in adversarial methods, we explored a different approach by directly reusing the classifier as a discriminator. Simultaneously, we introduced a new domain bias quantification method, called the Normalized Distance (ND), enabling the classifier to focus on both domain-level and class-level information.
To further address the issue of domain sample imbalance, we introduced a dynamic balancing learning method that adjusts the sample weighting strategy based on differences in sample quantities and variations in class discriminability. Our approach updates the weight allocation in real time during training, enabling the model to progressively achieve balanced attention to both the source and target domain samples over iterative training. This prevents the bias or negative transfer issues caused by imbalanced data distributions.
Based on the aforementioned exploration, we propose a Dynamic Balance-based Domain Adaptation (DBDA) method for the self-correlated domain adaptive image classification. Our model achieves adversarial UDA training by reusing the classifier.
This algorithm dynamically adjusts the training weights based on the alignment status between the source and target domains, effectively handling distribution discrepancies and sample imbalance. Our method aligns domain-level information while preserving key class-discriminative features, preventing information loss during feature transfer. The classifier-reuse strategy reduces the complexity of the training process, accelerates model convergence, and significantly enhances the model's ability to jointly represent domain and class features in a simple yet effective manner.
Additionally, our method employs a dynamic balancing algorithm to achieve a balanced state between the domain alignment and the class alignment, thus ensuring a high accuracy for both. During the transfer process, our method effectively balances the attention between the inter-domain distribution differences and the class discriminative ability. As a result, it trains a high-performance classifier with a strong generalization capability, thus providing a simple yet efficient solution for domain adaptation tasks. We conducted extensive experiments on multiple datasets, and the results demonstrate that the proposed method outperforms several state-of-the-art approaches.
The main contributions of our research are summarized as follows:
● We propose an adversarial training method that reuses the classifier as the discriminator, thus eliminating the need for additional discriminators and significantly reducing the training complexity and computational overhead. By reusing the classifier, the model can efficiently leverage the classifier's prediction information, thereby reducing information redundancy and potential biases during the domain alignment process.
● Our approach demonstrates an excellent performance in aligning intra-class and inter-class correlations in the target domain through classifier reuse. This causes the distribution of target domain samples to gradually approach that of the source domain. At the same time, it efficiently captures the class discriminative features, thus achieving the joint alignment and optimization of both domain-level and class-level features.
● We introduce a dynamic balancing mechanism that adjusts the model's weight distribution between the classification and adversarial tasks in real-time based on the alignment progress during training in both the source and target domains. This strategy effectively balances the requirements of the domain alignment and the class alignment, thus preventing situations where one task achieves a high accuracy while the other underperforms, thereby further enhancing the model's overall performance and transfer robustness. The code and datasets are available at https://github.com/Jwriter-2000/Domain-Adaptation-Dataset.
As a branch of transfer learning, UDA has seen rapid development in recent years[1,2]. UDA aims to leverage partially annotated source domain data and unlabeled target domain data to build a model with a generalization capability[4,5,21]. Existing methods broadly fall into several directions: domain discrepancy measurement-based learning methods (e.g., Maximum Mean Discrepancy)[5], contrastive learning-based methods[10,11], and adversarial-based methods[13,18]. These approaches aim to obtain feature representations with domain-invariant properties to reduce cross-domain distribution shifts.
Methods based on difference measurements primarily align two domains by quantifying the domain discrepancies, often employing MMD as a metric. Long et al. proposed minimizing the multi-kernel MMD between two domains along with classification prediction errors, thus learning abstract representations of features at different levels to align the domains[21]. Meanwhile, Ren et al. designed a Conditional Kernel Bures distance based on conditional distribution differences to offer an interpretable transfer method[22].
Contrastive methods create positive and negative sample pairs to construct contrastive loss, thereby aiming to bring similar samples from different domains closer for feature alignment[10]. Research by Shen et al. demonstrated that contrastive learning effectively decomposed information at both domain and class levels, thus enabling the transfer of knowledge from source to target domains even with significant domain differences[23]. The method proposed by CDA integrated both contrastive loss and MMD to narrow the gap between the source and target domains, thus achieving notable accuracy improvements[24]. Furthermore, methods based on disentangled representation learning decompose complex data representations into independent and meaningful latent factors [25,26,27,28,29]. Their strong robustness and transferability provide significant insights for research on UDA tasks.
Adversarial methods, inspired by Generative Adversarial Networks (GANs), aim to minimize the difference between the source and target domains through adversarial games[30]. These methods typically consist of a generative model G, which learns to capture the data distribution, and an additional discriminator model D, which is tasked with distinguishing whether samples belong to the source domain or the target domain. Ganin et al. first introduced adversarial methods to domain adaptation tasks with the Domain Adversarial Neural Network (DANN), thus showcasing GAN's strong transfer capabilities in UDA tasks[12]. Subsequent research, as proposed by Long et al., integrated learned features with discriminator predictions to align features across different domains[31].
In addition to methods that used an additional discriminator, some teams have explored integrating two task-specific classifiers as discriminators, thereby using the difference in their outputs to guide adversarial training. Maximum Classifier Discrepancy (MCD) initially used the L1 distance to represent discrepancies between two classifiers' outputs[5]. Meanwhile, other teams have investigated various methods to measure discrepancies, thereby proposing a series of valuable approaches. Cross-Gradient Difference Minimization (CGDM) adopts a cross-domain gradient difference to alleviate the domain discrepancies[9].
Besides the aforementioned methods, knowledge distillation based on teacher-student models is also one of the research hotspots. For example, [32] proposed a multi-view latent space learning framework that achieved reliable pseudo-labeling through a multi-view contrast. [33] fused the similarity relationships predicted by different teacher networks as supervision to optimize the student network, which involved more sample relationships, ultimately achieving domain-invariant knowledge transfer. Meanwhile, graph learning-based methods [34] also show considerable potential in UDA tasks.
The first type of adversarial training method typically relies on additional discriminators, which not only increases the complexity and computational cost of model training, but may also lead to biases in the discriminator when the data distributions of the source and target domains are inconsistent, thereby affecting the effectiveness of the domain alignment. On the other hand, methods without specific discriminators, while simplifying the structure, often focus only on within-class differences and lack global control over inter-class and inter-domain differences, making it easier to fall into suboptimal solutions. Moreover, these methods often excessively pursue a complete alignment of source and target domain features, neglecting the potential distributional differences and sample imbalance between the two. This oversight can lead to domain misalignment during the adaptation process, reducing the model's transferability and classification performance. To address these issues, we propose a classifier-reuse adversarial method built on a new dynamic balance discrepancy measurement algorithm. This approach considers both domain and class information, dynamically weighting them based on the alignment status during training, which ultimately yields a well-performing classifier.
Unlike traditional GAN models, discriminator-free methods do not require the explicit design or training of a discriminator to distinguish between generated data and real data. These methods reduce the complexity of the model training process by eliminating the discriminator and avoid common issues in GAN training, such as the imbalance between the discriminator and the generator. Singh et al. [35] proposed a method that fits a two-component Gaussian mixture model to source and target predictions, where the resulting Gaussian distributions are used to define an adversarial loss based on the Fréchet distance. Chen et al. [36] demonstrated that discriminator-free methods are also effective for well-aligned multimodal images. Discriminator-free adversarial learning is an important research direction in the adversarial domain, especially given its potential for improved efficiency and stability.
Dynamic balancing adjusts the system parameters in real time based on the training progress or specific metrics, enabling the model to balance the relationships between different tasks, objectives, or data during training. Many existing studies have incorporated dynamic balancing into UDA tasks from various perspectives, such as [38]. In adversarial training, balancing the training of the generator and discriminator is crucial, and dynamic balancing methods can adjust the training strategies between them to ensure the overall stability of the model. Our method updates the weight distribution in real time during training, enabling the model to progressively balance its focus on the source and target domain samples during iterative training, thereby avoiding issues such as domain shift or negative transfer caused by data distribution imbalance.
Our method's overall framework is illustrated in Figure 1. We use a pre-trained source domain model to initialize the training process.
To address the issue of differing sample sizes between the two domains, we first weight the samples according to their quantities to reduce errors caused by an imbalanced spatial distribution. Then, the weighted data is mapped into the feature generator G and the classifier C, which is specifically designed for the classification task, thus generating the corresponding predictions. Subsequently, the classifier is combined with the Normalized Distance (ND) and used as a discriminator. The model consists of a pre-trained ResNet-based generator G and a classifier C constructed from fully connected layers and a softmax layer. During the forward propagation, the Gradient Reversal Layer (GRL) remains inactive. However, during backpropagation, the GRL multiplies the gradient by a negative scalar to reverse the gradient, thus maximizing the domain discrimination loss. This process enables the learning of the domain-invariant features and completes the transfer classification task, thus allowing the module to automatically update its parameters. Finally, we developed a dynamic balancing parameter that adjusts the weights of the domain alignment loss and class discriminability loss, thus achieving a dynamic balance between the domain alignment and the class alignment. This ensures that both aspects are equally prioritized and aligned at multiple levels.
First, we define a source domain containing $N_s$ labeled samples $D_s = \{(x_i^s, y_i^s)\}_{i=1}^{N_s}$, where $x_i^s \in X_s$ and $y_i^s \in Y_s$, with labels covering $k$ categories. Additionally, we have an unlabeled target domain $D_t = \{x_i^t\}_{i=1}^{N_t}$, where $x_i^t \in X_t$. During training, when the number of image samples in one domain significantly exceeds that in the other, the imbalance can lead to higher attention and bias in the model, resulting in negative transfer issues. To address this, we reweight the two domains based on sample quantities to balance their contributions. The reweighting method is as follows:
$$\tilde{x}_i^s = \lambda \frac{n_s + n_t}{n_s} x_i^s, \quad i = 1, 2, \dots, n_s; \qquad \tilde{x}_j^t = \lambda \frac{n_s + n_t}{n_t} x_j^t, \quad j = 1, 2, \dots, n_t, \tag{3.1}$$
where $n_s$ and $n_t$ represent the number of samples in the source and target domains, respectively, and $\lambda \in (0, 1]$ is a hyperparameter used to dynamically control the sample weights. After reweighting, the samples from both domains are fed into the feature generator for the next stage of training.
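As a minimal illustration (our own sketch, not the authors' released code), the reweighting in Eq (3.1) amounts to scaling each domain's batch by a per-domain scalar before it enters the generator; `lam` stands for the hyperparameter $\lambda$:

```python
import torch

def reweight_domains(x_s: torch.Tensor, x_t: torch.Tensor, lam: float = 0.5):
    """Scale source/target batches by lam * (n_s + n_t) / n_domain, as in Eq (3.1).

    x_s: source batch of shape (n_s, C, H, W); x_t: target batch of shape (n_t, C, H, W).
    """
    n_s, n_t = x_s.size(0), x_t.size(0)
    x_s_tilde = lam * (n_s + n_t) / n_s * x_s  # the smaller domain receives the larger weight
    x_t_tilde = lam * (n_s + n_t) / n_t * x_t
    return x_s_tilde, x_t_tilde
```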
Inspired by GANs, adversarial learning-based UDA methods typically require a generator $G(\cdot)$ and an additional discriminator $D(\cdot)$ for adversarial training. The generator extracts features $f_s = G(x_s)$ and $f_t = G(x_t)$ from the samples, and the discriminator $D(\cdot)$ outputs the corresponding domain predictions. Then, the classifier $C(\cdot)$ outputs the predicted class probabilities. The loss functions are usually defined as follows:
$$\mathcal{L}_{cls} = \mathbb{E}_{(x_i^s, y_i^s) \sim D_s} \, \mathcal{L}_{ce}\big(C(G(x_i^s)), y_i^s\big), \tag{3.2}$$
$$\mathcal{L}_{adv} = \mathbb{E}_{G(x_i^s) \sim \tilde{D}_s} \log\big[D(G(x_i^s))\big] + \mathbb{E}_{G(x_i^t) \sim \tilde{D}_t} \log\big[1 - D(G(x_i^t))\big], \tag{3.3}$$
where $\mathcal{L}_{ce}$ is the classification loss, and $\tilde{D}_s$ and $\tilde{D}_t$ represent the distributions of generated features for $D_s$ and $D_t$, respectively. Common adversarial learning-based UDA methods can be categorized into two types based on whether an additional discriminator is used. Since the original task-specific classifier can implicitly discriminate between the source and target domains, we opt to reuse the classifier $C$ as the discriminator $D$.
We start by training on labeled samples from the source domain, where supervised training significantly enhances the classification accuracy, enabling the classifier to provide correct answers during prediction. Conversely, during unsupervised training on the unlabeled target domain, the classifier tends to produce incorrect predictions. We construct a self-correlation matrix that reflects true labels, predicted labels, and prediction accuracy, thereby capturing the intra-class and inter-class correlations in the data. During training, we utilize the differences in the self-correlation matrices between the source and target domains to construct an evaluator that optimizes adversarial domain adaptation. For a dataset with $h$ samples and $k$ classes, the classifier $C$'s prediction matrix $Z = C(f) \in S^{h \times k}$ satisfies:
$$\sum_{j=1}^{k} Z_{i,j} = 1, \quad \forall i \in \{1, 2, \dots, h\}, \tag{3.4}$$
$$Z_{i,j} \ge 0, \quad \forall i \in \{1, \dots, h\}, \ j \in \{1, \dots, k\}. \tag{3.5}$$
Therefore, the self-correlation matrix is defined as $R = Z^T Z \in S^{k \times k}$. The main diagonal elements of $R$ capture the intra-class correlations, while the off-diagonal elements capture the inter-class correlations. To achieve domain alignment, we attempt to bring the correlations in the target domain closer to those in the source domain. We define the overall intra-class correlation $I_a$ and inter-class correlation $I_b$ as:
$$I_a = \sum_{i=1}^{k} R_{ii}, \tag{3.6}$$
$$I_b = \sum_{i \ne j}^{k} R_{ij}. \tag{3.7}$$
For the source domain, predictions tend to increase the value of $I_a$ and decrease the value of $I_b$. Conversely, in the target domain, due to the lack of supervised training, $I_a$ tends to decrease while $I_b$ increases. Therefore, the domain difference can be represented by $I_a - I_b$. According to equation (3.4), $I_a + I_b = h$, and $I_a = \|Z\|_F^2$, where $\|\cdot\|_F$ denotes the Frobenius norm; hence $I_a - I_b = 2\|Z\|_F^2 - h$. The cross-domain discrepancy can then be written as $\Delta M = \|Z_s\|_F - \|Z_t\|_F$, where $Z_s$ and $Z_t$ are the prediction matrices of the source and target domains, respectively. Since $Z$ is predicted by the classifier $C$, $2\|C(f)\|_F^2 - h$ can be used as the correlation assessment function. Furthermore, considering that the weights and the sample size $h$ are constants, we can directly use $\|C(f)\|_F$ as the critic function for the correlation assessment.
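To make these quantities concrete, the sketch below (our own illustration; the tensor names are hypothetical) computes the prediction matrix $Z$, the self-correlation matrix $R = Z^T Z$, the correlations $I_a$ and $I_b$, and the Frobenius-norm critic for one batch:

```python
import torch
import torch.nn.functional as F

def correlation_stats(logits: torch.Tensor):
    """logits: (h, k) classifier outputs for a batch of h samples over k classes."""
    Z = F.softmax(logits, dim=1)              # rows sum to 1, entries >= 0 (Eqs 3.4, 3.5)
    R = Z.t() @ Z                             # self-correlation matrix, shape (k, k)
    I_a = torch.diagonal(R).sum()             # intra-class correlation (Eq 3.6)
    I_b = R.sum() - I_a                       # inter-class correlation (Eq 3.7); I_a + I_b = h
    critic = torch.linalg.matrix_norm(Z, ord='fro')  # ||Z||_F, the critic function
    return R, I_a, I_b, critic
```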
To balance the importance of the source and target domain data, we introduce a $K$-Lipschitz evaluation function $H$. This function aims to assign high scores to representations of the source data $f \in D_s$ and low scores to representations of the target data $f \in D_t$, while also considering the 1-Wasserstein distance $W_1(\tilde{D}_s, \tilde{D}_t)$ between the two feature distributions $\tilde{D}_s$ and $\tilde{D}_t$:
$$W_1(\tilde{D}_s, \tilde{D}_t) = \sup_{\|H\|_L \le P} \mathbb{E}_{f \sim \tilde{D}_s}[H(f)] - \mathbb{E}_{f \sim \tilde{D}_t}[H(f)], \tag{3.8}$$
where $\|\cdot\|_L$ denotes the Lipschitz semi-norm [2], and $P$ represents the Lipschitz constant. However, as mentioned above, $\|C\|_F$ can effectively serve as a critic function for the discriminator $D$. This can be expressed as follows:
$$W_F(\tilde{D}_s, \tilde{D}_t) = \sup_{\|C\|_F \le P} \mathbb{E}_{\tilde{D}_s}\big[\|C(f)\|_F\big] - \mathbb{E}_{\tilde{D}_t}\big[\|C(f)\|_F\big]. \tag{3.9}$$
Therefore, by employing this approach, the domain alignment and class distinction can be simultaneously achieved under a unified objective. This method enables a better utilization of the discriminative information from predictions to capture the multimodal structures within the feature distribution.
However, adversarial learning based on the Frobenius-norm 1-Wasserstein distance tends to shift categories with few samples towards adjacent categories with many samples, rather than aligning them along the decision boundary, which results in a lack of diversity in the predictions. Inspired by the nuclear norm, maximizing $\|Z\|_*$, which implies maximizing the rank of $Z$ while the Frobenius norm remains bounded, maintains the diversity of predictions. Therefore, we use the nuclear norm instead of the Frobenius norm:
$$W_H(\tilde{D}_s, \tilde{D}_t) = \sup_{\|C\|_* \le P} \mathbb{E}_{\tilde{D}_s}\big[\|C(f)\|_*\big] - \mathbb{E}_{\tilde{D}_t}\big[\|C(f)\|_*\big]. \tag{3.10}$$
The discriminator can thus be expressed as $D = \|C\|_*$. In this model, there is no separate discriminator; the classifier performs this task instead. This helps achieve feature-level alignment while the classifier performs classification, thereby facilitating class-level alignment. Since the classifier acts as an implicit discriminator and its components satisfy the $P$-Lipschitz constraint, additional weight-clipping and gradient-penalty strategies are not needed for training. Consequently, the nuclear-norm discrepancy can be recovered as the maximized domain critic loss:
$$\mathcal{L}_w(\tilde{x}^s, \tilde{x}^t) = \frac{1}{N_s} \sum_{i=1}^{N_s} D\big(G(\tilde{x}_i^s)\big) - \frac{1}{N_t} \sum_{j=1}^{N_t} D\big(G(\tilde{x}_j^t)\big), \tag{3.11}$$
$$W_H(\tilde{D}_s, \tilde{D}_t) = \max_{D} \mathcal{L}_w(\tilde{x}^s, \tilde{x}^t). \tag{3.12}$$
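A minimal sketch of this critic loss follows, assuming `G` and `C` are the feature generator and classifier modules and reading Eq (3.11) at the batch level, with the nuclear norm of the softmax prediction matrix serving as $D$; this is our interpretation, not the released implementation:

```python
import torch
import torch.nn.functional as F

def critic_loss(G, C, x_s: torch.Tensor, x_t: torch.Tensor) -> torch.Tensor:
    """L_w: nuclear norm of source predictions minus that of target predictions (Eq 3.11)."""
    Z_s = F.softmax(C(G(x_s)), dim=1)  # (N_s, k) source prediction matrix
    Z_t = F.softmax(C(G(x_t)), dim=1)  # (N_t, k) target prediction matrix
    d_s = torch.linalg.matrix_norm(Z_s, ord='nuc')  # sum of singular values
    d_t = torch.linalg.matrix_norm(Z_t, ord='nuc')
    return d_s / x_s.size(0) - d_t / x_t.size(0)
```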
In this adversarial learning framework, to enhance the diversity of the model's alternating updates, we use a gradient reversal layer (GRL), which avoids the need for the aforementioned weight-clipping and gradient-penalty strategies. Under these conditions, the model training follows a max-min approach:
$$\min_G \max_C \mathcal{L}_w(\tilde{x}^s, \tilde{x}^t). \tag{3.13}$$
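The GRL itself is standard (cf. Ganin et al. [12]): it is an identity map in the forward pass whose backward pass multiplies the gradient by a negative scalar, so a single backward step can, in effect, train $C$ to maximize $\mathcal{L}_w$ while $G$ minimizes it. A typical PyTorch sketch:

```python
import torch

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; multiplies the gradient by -coeff on the way back."""

    @staticmethod
    def forward(ctx, x, coeff: float = 1.0):
        ctx.coeff = coeff
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.coeff * grad_output, None  # None: no gradient for coeff

def grad_reverse(x: torch.Tensor, coeff: float = 1.0) -> torch.Tensor:
    return GradReverse.apply(x, coeff)
```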
Meanwhile, to prevent the source domain data from excessively influencing the prediction results and to ensure the authenticity of the UDA classification, both the generator $G$ and the classifier $C$ need to be optimized by minimizing the classification loss:
$$\mathcal{L}_c(\tilde{x}^s, \tilde{y}^s) = \frac{1}{N_s} \sum_{i=1}^{N_s} \mathcal{L}_{ce}\big(C(G(\tilde{x}_i^s)), \tilde{y}_i^s\big). \tag{3.14}$$
To ensure a good balance between alignment and classification, and to have both work together towards better results, this section introduces a dynamic balancing factor. This factor controls the domain alignment loss and the class separability loss, allowing real-time monitoring of alignment and separability during training. We use the Maximum Mean Discrepancy (MMD) and Linear Discriminant Analysis (LDA) to measure the current feature representation's cross-domain alignment and separability, respectively. MMD is primarily used to measure the distance between two different but related distributions:
$$\mathrm{MMD}(D_s, D_t) = \Big\| \mathbb{E}_{x_i^s \sim D_s} G(\tilde{x}_i^s) - \mathbb{E}_{x_j^t \sim D_t} G(\tilde{x}_j^t) \Big\|^2. \tag{3.15}$$
The LDA-based separability estimator $\max_W L(D)$ is given by the following:
$$\max_W L(D) = \frac{\mathrm{tr}(W^T P_b W)}{\mathrm{tr}(W^T P_w W)}, \tag{3.16}$$
where $P_b$ represents the between-class scatter matrix, and $P_w$ denotes the within-class scatter matrix. Clearly, the larger the value of $L(D)$, the better the discriminative performance of the method.
Since the evaluation criteria of the two measures are not on the same scale, their estimated values are normalized using min–max scaling, which linearly maps each evaluation value to the range [0, 1]. This establishes the dynamic balancing factor. Let $\widetilde{\mathrm{MMD}}(D_s, D_t)$ and $\tilde{L}(D)$ represent the normalized values of $\mathrm{MMD}(D_s, D_t)$ and $\max_W L(D)$, respectively:
$$\widetilde{\mathrm{MMD}}(D_s, D_t) = \frac{\mathrm{MMD}(D_s, D_t) - \mathrm{MMD}(D_s, D_t)_{\min}}{\mathrm{MMD}(D_s, D_t)_{\max} - \mathrm{MMD}(D_s, D_t)_{\min}}, \tag{3.17}$$
$$\tilde{L}(D) = \frac{L(D) - L(D)_{\min}}{L(D)_{\max} - L(D)_{\min}}. \tag{3.18}$$
Through normalization, we have $\widetilde{\mathrm{MMD}}(D_s, D_t) \in (0, 1]$ and $\tilde{L}(D) \in (0, 1]$. Thus, the dynamic balancing factor is constructed as follows:
$$\alpha = \frac{\widetilde{\mathrm{MMD}}(D_s, D_t)}{\widetilde{\mathrm{MMD}}(D_s, D_t) + \big(1 - \tilde{L}(D)\big)}. \tag{3.19}$$
In this equation, a smaller $\widetilde{\mathrm{MMD}}(D_s, D_t)$ indicates better domain alignment, and a smaller $1 - \tilde{L}(D)$ indicates better class separability. When $\widetilde{\mathrm{MMD}}(D_s, D_t)$ approaches 0 and $1 - \tilde{L}(D)$ approaches 1, the alignment effect is much better than the classification effect, and $\alpha$ approaches 0. Conversely, when $\widetilde{\mathrm{MMD}}(D_s, D_t)$ approaches 1 and $1 - \tilde{L}(D)$ approaches 0, the classification effect is much better than the alignment effect, and $\alpha$ approaches 1. Based on this, $\alpha$ is used as the weight for the domain alignment loss, and $1 - \alpha$ as the weight for the class separability loss. When $\alpha$ and $1 - \alpha$ are close, the alignment and classification effects are also close. Therefore, the dynamic weighting model for domain alignment and class discrimination is as follows:
$$\min_{G, C} \Big\{ (1 - \alpha) \mathcal{L}_{adv}(\tilde{x}^s, \tilde{y}^s) + \alpha \max_C \mathcal{L}_w(\tilde{x}^s, \tilde{x}^t) \Big\}. \tag{3.20}$$
In summary, our method ultimately aims to optimize an overall loss function:
$$\mathcal{L} = \min_{G, C} \Big\{ \mathcal{L}_c + \alpha \max_C \mathcal{L}_{adv} + (1 - \alpha) \max_C \mathcal{L}_w \Big\}, \tag{3.21}$$
where the dynamic balancing factor $\alpha$ weighs the domain alignment loss and $1 - \alpha$ weighs the class separability loss, ensuring that both the alignment and classification effects remain balanced throughout the training process.
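Putting the pieces together, the following sketch shows one way to compute $\alpha$ from running min–max statistics of a linear MMD estimate (Eq 3.15) and a simplified LDA criterion (Eq 3.16 with $W = I$); the function names, the simplification, and the running-normalization scheme are our assumptions, since the paper does not spell out how the min/max values are tracked:

```python
import torch

def linear_mmd(f_s: torch.Tensor, f_t: torch.Tensor) -> torch.Tensor:
    """Squared distance between domain feature means (cf. Eq 3.15)."""
    return (f_s.mean(dim=0) - f_t.mean(dim=0)).pow(2).sum()

def lda_criterion(features: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """tr(P_b) / tr(P_w): between- over within-class scatter (cf. Eq 3.16 with W = I)."""
    mu = features.mean(dim=0)
    tr_b, tr_w = 0.0, 0.0
    for c in labels.unique():
        fc = features[labels == c]
        tr_b = tr_b + fc.size(0) * (fc.mean(dim=0) - mu).pow(2).sum()
        tr_w = tr_w + (fc - fc.mean(dim=0)).pow(2).sum()
    return tr_b / (tr_w + 1e-8)

def balance_factor(mmd, mmd_min, mmd_max, lda, lda_min, lda_max):
    """Eqs (3.17)-(3.19): min-max normalize both measures, then form alpha."""
    mmd_n = (mmd - mmd_min) / (mmd_max - mmd_min + 1e-8)
    lda_n = (lda - lda_min) / (lda_max - lda_min + 1e-8)
    return mmd_n / (mmd_n + (1.0 - lda_n) + 1e-8)

# Total objective (cf. Eq 3.21), with L_c, L_adv, L_w computed elsewhere:
# loss = L_c + alpha * L_adv + (1 - alpha) * L_w
```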
In this section, we introduce the datasets and model parameter settings, present the experimental results of our proposed method, compare it with other methods, and finally analyze and interpret the experiments from multiple perspectives using both data and visualizations.
We performed experiments on four classical UDA image classification datasets, namely Office-31[39], ImageCLEF[31], Office Home[4], and VisDA-2017[40]:
(1) Office-31 is a common benchmark dataset for domain adaptation, covering images from three domains: Amazon products (A), DSLR cameras (D), and webcams (W). Each domain comprises 31 categories, with a total of 4110 images. We experimented with all domain combinations.
(2) Image-CLEF is also a widely used standard domain adaptation dataset, consisting of three domains with significant style differences: Caltech-256 (C), ImageNet ILSVRC 2012 (I), and PASCAL VOC 2012 (P). Each domain contains 600 images, with 50 images in each of 12 shared categories. We conducted all six transfer experiments on the Image-CLEF dataset.
(3) Office-Home is a highly challenging UDA dataset, containing samples from four distinct domains: art images (A), clip art (C), product images (P), and real-world images (R). It includes 15,500 images of 65 categories of everyday objects in office and home settings.
(4) VisDA is a large-scale domain adaptation dataset comprising two domains, synthetic and real, with over 280k images across 12 categories.
According to the specifications of unsupervised domain adaptation, all labeled source domain samples and all unlabeled target domain samples are involved in the training process. The proposed method is implemented in the PyTorch framework. We use the SGD optimizer with a momentum of 0.9, a weight decay of 1e-3, a batch size of 36, and a cropped image size of 224 × 224. The initial learning rate for the classifier C is set to 5e-3, which is 10 times that of the feature extractor G.
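Read literally, these settings correspond to per-module parameter groups such as the following sketch; the module definitions are placeholders for the actual backbone and head:

```python
import torch
import torch.nn as nn
from torchvision.models import resnet50

# Placeholder modules: G is a ResNet-50 feature extractor, C a classifier head.
G = nn.Sequential(*list(resnet50(weights=None).children())[:-1])
C = nn.Sequential(nn.Flatten(), nn.Linear(2048, 31))  # e.g., 31 classes for Office-31

optimizer = torch.optim.SGD(
    [
        {"params": G.parameters(), "lr": 5e-4},  # feature extractor: 1/10 of classifier lr
        {"params": C.parameters(), "lr": 5e-3},  # classifier C
    ],
    momentum=0.9,
    weight_decay=1e-3,
)
```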
Baseline: To evaluate the effectiveness of our method, we construct a baseline that does not reuse the classifier as a discriminator, uses the MMD loss as the adversarial loss, and fixes the balance parameters α and (1−α) at 0.5 each. We compare this baseline with our method on the aforementioned datasets to validate the effectiveness of our approach.
The results for Office-Home are shown in Table 1. By comparison, it is evident that the proposed method significantly improves the classification accuracy. For instance, in tasks with substantial inter-domain differences and class imbalances, such as A→R and C→R, our method achieves accuracies of 80.1% and 74.8%, respectively. Furthermore, in the P→A, P→C, and P→R tasks, the obtained accuracies are consistently higher than the average of existing methods. This improvement is attributed to the introduced constraints, the effective utilization of predicted discriminative information, and the maintained balance between domain alignment and class alignment.
Methods | A→C | A→P | A→R | C→A | C→P | C→R | P→A | P→C | P→R | R→A | R→C | R→P | Avg |
ResNet-50[41] | 34.9 | 50.0 | 58.0 | 37.4 | 41.9 | 46.2 | 38.5 | 31.2 | 60.4 | 53.9 | 41.2 | 59.9 | 46.1 |
WDGRL[42] | 44.1 | 63.8 | 74.0 | 47.3 | 57.1 | 61.7 | 51.8 | 39.1 | 72.1 | 64.9 | 45.9 | 76.5 | 58.2 |
JAN[31] | 45.9 | 61.2 | 68.9 | 50.4 | 59.7 | 61.0 | 45.8 | 43.4 | 70.3 | 63.9 | 52.4 | 76.8 | 58.3 |
DANN[2] | 45.6 | 59.3 | 70.1 | 47.0 | 58.5 | 60.9 | 46.1 | 43.7 | 68.5 | 63.2 | 51.8 | 76.8 | 57.6 |
MCD[5] | 48.9 | 68.3 | 74.6 | 61.3 | 67.6 | 68.8 | 57.0 | 47.1 | 75.1 | 69.1 | 52.2 | 79.6 | 64.1 |
HAFN[43] | 50.2 | 70.1 | 76.6 | 61.1 | 68.0 | 70.7 | 59.5 | 48.4 | 77.3 | 69.4 | 53.0 | 80.2 | 65.4 |
ETD[44] | 51.3 | 71.9 | 85.7 | 57.6 | 69.2 | 73.7 | 57.8 | 51.2 | 79.3 | 70.2 | 57.5 | 82.1 | 67.3 |
FGDA[45] | 51.8 | 72.0 | 79.2 | 62.7 | 72.0 | 73.9 | 60.0 | 49.7 | 79.2 | 69.5 | 56.8 | 82.3 | 67.4 |
DFE-DA[46] | 56.4 | 74.9 | 78.2 | 62.8 | 72.3 | 73.2 | 62.3 | 53.8 | 80.4 | 72.4 | 60.5 | 82.8 | 69.2 |
DMDA[47] | 55.7 | 75.9 | 78.6 | 56.7 | 76.1 | 73.9 | 57.9 | 51.2 | 79.2 | 66.8 | 57.6 | 82.6 | 67.7 |
BuresNet[22] | 54.7 | 74.4 | 77.1 | 63.7 | 72.2 | 71.8 | 64.1 | 51.7 | 78.4 | 73.1 | 58.0 | 82.4 | 68.5
AEGDM[48] | 56.0 | 70.7 | 79.8 | 64.6 | 73.1 | 71.6 | 61.2 | 53.1 | 79.1 | 71.6 | 59.7 | 84.3 | 68.8 |
Baseline | 55.3 | 70.9 | 76.2 | 62.1 | 69.8 | 75.6 | 60.8 | 53.2 | 78.4 | 69.3 | 58.6 | 80.7 | 67.6 |
Ours | 58.7 | 75.6 | 79.5 | 64.8 | 75.3 | 76.7 | 65.5 | 56.7 | 81.2 | 74.6 | 59.8 | 86.0 | 71.2 |
The results for ImageCLEF are shown in Table 2, which presents the experimental outcomes of six domain adaptation tasks and the average accuracy on this dataset. Our method achieves the optimal average accuracy of 90.4%, with the best performance in the I→P and P→I tasks.
Methods | I→P | P→I | I→C | C→I | C→P | P→C | Avg |
ResNet-50[41] | 74.8 | 83.9 | 91.5 | 78.0 | 65.5 | 91.2 | 80.7 |
WDGRL[42] | 76.8 | 87.0 | 91.7 | 87.2 | 75.2 | 90.3 | 84.7 |
MCD[5] | 77.3 | 89.2 | 92.7 | 88.2 | 71.0 | 92.3 | 85.1 |
JAN[31] | 76.8 | 88.0 | 94.7 | 89.5 | 74.2 | 91.7 | 85.8 |
DANN[2] | 75.0 | 86.0 | 96.2 | 84.0 | 74.3 | 91.5 | 85.0 |
CAT[49] | 76.7 | 89.0 | 94.5 | 89.8 | 74.0 | 93.7 | 86.3 |
AEGDM[48] | 81.4 | 93.2 | 98.1 | 92.1 | 78.1 | 96.5 | 89.9 |
IWCA[50] | 77.5 | 91.3 | 97.0 | 90.5 | 75.8 | 95.3 | 87.9 |
MEDM-LS[51] | 78.2 | 93.3 | 97.2 | 93.0 | 78.3 | 95.5 | 89.3 |
CGDM[9] | 78.7 | 93.3 | 97.5 | 92.7 | 79.2 | 95.7 | 89.5 |
Baseline | 80.6 | 89.4 | 94.9 | 89.3 | 75.4 | 92.6 | 87.0 |
Ours | 83.0 | 94.5 | 98.0 | 92.5 | 78.0 | 96.3 | 90.4 |
The results for Office-31 are shown in Table 3. The proposed model achieves optimal performance with an average accuracy of 88.9%. Specifically, in task A→W, our method achieves a high accuracy of 93.0%, the highest among the compared models. In the relatively more challenging tasks of D→A and W→A, our method demonstrates relatively stable performance despite the significant domain distribution inconsistency and sample imbalance involved in adapting from domains with fewer samples (D and W) to a domain with more samples (A). This indicates that our model not only has strong robustness in domain alignment, but is also capable of effectively handling the complexity of the class distribution caused by a larger number of target domain samples.
Methods | A→W | D→W | W→D | A→D | D→A | W→A | Avg |
ResNet-50[41] | 68.4 | 96.7 | 99.3 | 68.9 | 62.5 | 60.7 | 76.1 |
DAN[21] | 80.5 | 97.1 | 99.6 | 78.6 | 63.6 | 62.8 | 80.4 |
WDGRL[42] | 72.6 | 97.1 | 99.2 | 79.5 | 63.7 | 59.5 | 78.6 |
DANN[2] | 82.6 | 96.9 | 99.3 | 81.5 | 68.4 | 67.5 | 82.7 |
ADDA[52] | 86.2 | 96.2 | 98.4 | 77.8 | 69.5 | 68.9 | 82.9 |
MADA[53] | 90.0 | 97.4 | 99.6 | 87.8 | 70.3 | 66.4 | 85.2 |
CAT[49] | 91.1 | 98.6 | 99.6 | 90.6 | 70.4 | 66.5 | 86.1 |
ETD[44] | 92.1 | 100.0 | 100.0 | 88.0 | 71.0 | 67.8 | 86.2 |
JBL[54] | 91.2 | 97.6 | 100.0 | 86.9 | 70.5 | 71.8 | 86.3 |
DFE-DA[46] | 88.3 | 99.4 | 100.0 | 87.6 | 74.3 | 73.1 | 86.9 |
DMDA[47] | 91.6 | 98.6 | 99.4 | 90.0 | 73.8 | 74.0 | 87.7 |
BSWD[55] | 90.1 | 99.0 | 100.0 | 89.0 | 75.9 | 72.8 | 87.8 |
Baseline | 86.7 | 98.2 | 99.4 | 86.6 | 70.4 | 68.3 | 84.9 |
Ours | 93.0 | 99.8 | 100.0 | 91.0 | 75.6 | 73.7 | 88.9 |
The results for VisDA-2017 are shown in Table 4, where our model achieves an average accuracy of 83.0%. The method dynamically weighs the real-time estimation of transferability and discriminability during the iteration process, thus ensuring both aspects evolve towards an improved performance.
Methods | Syn→Real
ResNet-50[41] | 74.8 |
WDGRL[42] | 76.8 |
MCD[5] | 77.3 |
DANN[2] | 75.0 |
TCM[56] | 75.8 |
CGDM[9] | 82.3 |
Baseline | 75.9 |
Ours | 83.0 |
In experiments conducted on all datasets, we perform a longitudinal comparison with the baseline. The experimental results demonstrate that our method fully accounts for the roles of domain information and class information during model training, effectively balancing the focus on inter-domain distribution differences and class discrimination capability. Our method achieves superior performance compared to the baseline, proving its effectiveness.
Ablation Study: In this section, we analyze the impact of the Wasserstein Difference and the Dynamic Balancing Factor on the performance improvement. The results are shown in Table 5. We conduct ablation experiments on the Office-31 dataset across three tasks (A→D, D→W, W→A) by evaluating the effects of using or not using the Wasserstein Difference Loss in adversarial training and the Dynamic Balancing mechanism. In the second column of the table, the Wasserstein Difference Loss is replaced with MMD Loss for adversarial training. From the overall results in the table, it can be observed that removing both the Wasserstein Difference and the Dynamic Balancing mechanism negatively impacts the performance. Furthermore, it is evident that the use of the Wasserstein Difference Loss in the adversarial training of the discriminator has a more significant effect on the performance. Compared to the fixed balancing parameters, the Dynamic Balancing mechanism effectively maintains a better trade-off between the domain alignment and the classification, thus leading to consistent performance improvements.
Lw | MMD loss | Dynamic factor | fixed parameter | A→D | D→W | W→A |
× | √ | × | 5:5 | 86.6 | 98.2 | 68.3 |
√ | × | × | 5:5 | 90.1 | 99.6 | 73.1 |
× | √ | √ | - | 88.6 | 99.2 | 72.8 |
√ | × | √ | - | 91.0 | 99.8 | 73.7 |
Analysis of Dynamic Weighting Parameters: To specifically analyze the effect of the dynamic weighting strategy, we conducted experiments using static parameter settings. The experimental results are shown in Table 6. From the data, it can be observed that the dynamic weighting strategy achieves better performance than the static hyperparameter settings, demonstrating its effectiveness in balancing domain alignment and classification. As shown in Figure 2, we plot the classification error rate and the variation of the dynamic balancing parameter for the A→D task on the Office-31 dataset. It can be observed that as training progresses, the error rate of label prediction gradually converges. During the early stages of training, the domain information of the images is given higher importance, enabling rapid improvement in domain alignment. In the middle stages of training, the dynamic parameter gradually increases, emphasizing the importance of classification information and thereby further enhancing the classification process.
weight | 1: 9 | 3: 7 | 5: 5 | 7: 3 | 9: 1 | α:(1−α) |
Accuracy | 90.3 | 89.4 | 90.1 | 90.6 | 88.6 | 91.0 |
Confusion Matrix: The confusion matrices are shown in Figure 3. If the model is trained solely on the source data, it suffers from severe class confusion, significantly reducing classification accuracy, as depicted in Figure 3(a). DANN overlooks the discriminability between features, focusing solely on domain alignment, which leads to misclassification in certain categories, as shown in Figure 3(b). In our method, the main diagonal elements of the matrix exhibit the highest values, with significantly reduced off-diagonal elements, indicating far fewer misclassifications, as illustrated in Figure 3(c). This demonstrates the effectiveness of our approach.
Certainty is evaluated by calculating the ratio of correctly classified samples with high prediction confidence. Here, we consider the task A→R, where a prediction probability between 0.9 and 1 is deemed a high-confidence prediction. As shown in Figure 2(c), a model trained solely on the source domain is almost incapable of generating a high ratio of confident predictions. DANN significantly increases the ratio of high-confidence predictions, and DMDA further improves it to 85.7%. However, all these methods yield lower ratios than the method proposed in this paper, demonstrating its effectiveness in enhancing prediction certainty.
t-SNE Visualization: The t-SNE visualization is shown in Figure 4. The DBDA proposed in this paper yields more compact intra-class distributions and more dispersed inter-class distributions, indicating that the features learned by DBDA possess stronger discriminative power: the intra-class features are pulled together, while the inter-class features are pushed apart.
In our work, we proposed a simple yet effective adversarial paradigm that reuses task-specific classifiers as discriminators. To implement this paradigm, we utilized a clearly defined discrepancy measure and accordingly established a discriminator-free adversarial UDA model. This model learns transferable and discriminative representations while ensuring prediction certainty and diversity. Furthermore, based on a dynamic weighting algorithm, it efficiently adjusts the weights of domain alignment and class alignment during training, making it suitable for various applications and achieving a dynamic balance. The test results on multiple datasets indicated that the proposed algorithm performs well.
The authors declare they have not used Artificial Intelligence (AI) tools in the creation of this article.
This work was supported by the Open Fund of Key Laboratory of Philosophy and Social Science of Anhui Province on Adolescent Mental Health and Crisis Intelligence Intervention (SYS2024B05), the Anhui Mine IOT and Security Monitoring Technology Key Laboratory (2109Y-09-04), the Natural Science Research Project of Anhui Educational Committee (2022AH052144), and the Natural Science Research Project of Hefei Normal University (2022SKZD19).
The authors declare there are no conflicts of interest.
[1] | L. Zhu, L. L. Chan, T. K. Ng, M. Zhang, B. C. Ooi, Deep co-training for cross-modality medical image segmentation, in ECAI 2023, 372 (2023), 3140–3147. https://doi.org/10.3233/FAIA230633 |
[2] | Y. Ganin, V. Lempitsky, Unsupervised domain adaptation by backpropagation, in Proceedings of the 32nd International Conference on International Conference on Machine Learning, 37 (2015), 1180–1189. Available from: https://dl.acm.org/doi/10.5555/3045118.3045244. |
[3] | L. Abdi, S. Hashemi, Unsupervised domain adaptation based on correlation maximization, IEEE Access, 9 (2021), 127054–127067. https://doi.org/10.1109/ACCESS.2021.3111586 |
[4] | H. Venkateswara, J. Eusebio, S. Chakraborty, S. Panchanathan, Deep hashing network for unsupervised domain adaptation, in 2017 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), (2017), 5018–5027. https://doi.org/10.1109/CVPR.2017.572 |
[5] | K. Saito, K. Watanabe, Y. Ushiku, T. Harada, Maximum classifier discrepancy for unsupervised domain adaptation, in 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), (2018), 3723–3732. https://doi.org/10.1109/CVPR.2018.00392 |
[6] | B. Xie, L. Yuan, S. Li, C. H. Liu, X. Cheng, G. Wang, Active learning for domain adaptation: An energy-based approach, in Proceedings of the AAAI Conference on Artificial Intelligence, 36 (2022), 8708–8716. https://doi.org/10.1609/aaai.v36i8.20850 |
[7] | E. Tzeng, J. Hoffman, N. Zhang, K. Saenko, T. Darrell, Deep domain confusion: Maximizing for domain invariance, preprint, arXiv: 1412.3474. |
[8] | S. Li, S. Song, G. Huang, Z. Ding, C. Wu, Domain invariant and class discriminative feature learning for visual domain adaptation, IEEE Trans. Image Process., 27 (2018), 4260–4273. https://doi.org/10.1109/TIP.2018.2839528 |
[9] | Z. Du, J. Li, H. Su, L. Zhu, K. Lu, Cross-domain gradient discrepancy minimization for unsupervised domain adaptation, in 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), (2021), 3937–3946. https://doi.org/10.1109/CVPR46437.2021.00393 |
[10] | T. Chen, S. Kornblith, M. Norouzi, G. Hinton, A simple framework for contrastive learning of visual representations, in Proceedings of the 37th International Conference on Machine Learning, 119 (2020), 1597–1607. |
[11] | C. Park, J. Lee, J. Yoo, M. Hur, S. Yoon, Joint contrastive learning for unsupervised domain adaptation, preprint, arXiv: 2006.10297. |
[12] | Y. Ganin, E. Ustinova, H. Ajakan, P. Germain, H. Larochelle, F. Laviolette, et al., Domain-adversarial training of neural networks, J. Mach. Learn. Res., 17 (2016), 1–35. https://doi.org/10.1007/978-3-319-58347-1_10 |
[13] | H. Tang, K. Jia, Discriminative adversarial domain adaptation, in Proceedings of the AAAI Conference on Artificial Intelligence, 34 (2020), 5940–5947. https://doi.org/10.1609/aaai.v34i04.6054 |
[14] | N. Carion, F. Massa, G. Synnaeve, N. Usunier, A. Kirillov, S. Zagoruyko, End-to-end object detection with transformers, in European Conference on Computer Vision, 12346 (2020), 213–229. https://doi.org/10.1007/978-3-030-58452-8_13 |
[15] | Z. Gao, S. Zhang, K. Huang, Q. Wang, C. Zhong, Gradient distribution alignment certificates better adversarial domain adaptation, in 2021 IEEE/CVF International Conference on Computer Vision, (2021), 8937–8946. https://doi.org/10.1109/ICCV48922.2021.00881 |
[16] | H. Liu, M. Long, J. Wang, M. Jordan, Transferable adversarial training: A general approach to adapting deep classifiers, in Proceedings of the 36th International Conference on Machine Learning, 97 (2019), 4013–4022. Available from: https://proceedings.mlr.press/v97/liu19b.html. |
[17] | M. Xu, J. Zhang, B. Ni, T. Li, C. Wang, Q. Tian, et al., Adversarial domain adaptation with domain mixup, in Proceedings of the AAAI Conference on Artificial Intelligence, 34 (2020), 6502–6509. https://doi.org/10.1609/aaai.v34i04.6123 |
[18] | M. Long, Z. Cao, J. Wang, M. I. Jordan, Conditional adversarial domain adaptation, in Proceedings of the 32nd International Conference on Neural Information Processing Systems, 31 (2018), 1647–1657. Available from: https://dl.acm.org/doi/10.5555/3326943.3327094. |
[19] | M. Mirza, S. Osindero, Conditional generative adversarial nets, preprint, arXiv: 1411.1784. |
[20] | X. Chen, S. Wang, M. Long, J. Wang, Transferability vs. discriminability: Batch spectral penalization for adversarial domain adaptation, in Proceedings of the 36th International Conference on Machine Learning, 97 (2019), 1081–1090. Available from: https://proceedings.mlr.press/v97/chen19i.html. |
[21] | M. Long, Y. Cao, J. Wang, M. Jordan, Learning transferable features with deep adaptation networks, in Proceedings of the 32nd International Conference on Machine Learning, 37 (2015), 97–105. |
[22] | C. X. Ren, Y. W. Luo, D. Q. Dai, BuresNet: Conditional bures metric for transferable representation learning, IEEE Trans. Pattern Anal. Mach. Intell., 45 (2022), 4198–4213. https://doi.org/10.1109/TPAMI.2022.3190645 |
[23] | K. Shen, R. M. Jones, A. Kumar, S. M. Xie, J. Z. Haochen, T. Ma, et al., Connect, not collapse: Explaining contrastive learning for unsupervised domain adaptation, in Proceedings of the 39th International Conference on Machine Learning, 162 (2022), 19847–19878. |
[24] | M. Thota, G. Leontidis, Contrastive domain adaptation, in 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), (2021), 2209–2218. https://doi.org/10.1109/CVPRW53098.2021.00250 |
[25] | Y. Wang, Y. Li, S. Li, W. Song, J. Fan, S. Gao, et al., Deep graph mutual learning for cross-domain recommendation, in International Conference on Database Systems for Advanced Applications, 13246 (2022), 298–305. https://doi.org/10.1007/978-3-031-00126-0_22 |
[26] | Y. Wang, Y. Song, S. Li, C. Cheng, W. Ju, M. Zhang, et al., Disencite: Graph-based disentangled representation learning for context-specific citation generation, in Proceedings of the AAAI Conference on Artificial Intelligence, 36 (2022), 11449–11458. https://doi.org/10.1609/aaai.v36i10.21397 |
[27] | Y. Wang, X. Luo, C. Chen, X. S. Hua, M. Zhang, W. Ju, DisenSemi: Semi-supervised graph classification via disentangled representation learning, IEEE Trans. Neural Networks Learn. Syst., (2024), 1–13. https://doi.org/10.1109/tnnls.2024.3431871 |
[28] | Y. Wang, Y. Qin, F. Sun, B. Zhang, X. Hou, K. Hu, et al., DisenCTR: Dynamic graph-based disentangled representation for click-through rate prediction, in Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, (2022), 2314–2318. https://doi.org/10.1145/3477495.3531851 |
[29] | Y. Wang, S. Tang, Y. Lei, W. Song, S. Wang, M. Zhang, Disenhan: Disentangled heterogeneous graph attention network for recommendation, in Proceedings of the 29th ACM International Conference on Information & Knowledge Management, (2020), 1605–1614. https://doi.org/10.1145/3340531.3411996 |
[30] | I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, et al., Generative adversarial nets, in Proceedings of the 27th International Conference on Neural Information Processing Systems, 27 (2014). Available from: https://dl.acm.org/doi/10.5555/2969033.2969125. |
[31] | M. Long, H. Zhu, J. Wang, M. I. Jordan, Deep transfer learning with joint adaptation networks, in Proceedings of the 34th International Conference on Machine Learning, 70 (2017), 2208–2217. |
[32] | C. Zhu, Q. Wang, Y. Xie, S. Xu, Multiview latent space learning with progressively fine-tuned deep features for unsupervised domain adaptation, Inf. Sci., 662 (2024), 120223. https://doi.org/10.1016/j.ins.2024.120223 |
[33] | X. Liu, S. Zhang, Graph consistency based mean-teaching for unsupervised domain adaptive person re-identification, preprint, arXiv: 2105.04776. |
[34] | C. Zhu, L. Zhang, W. Luo, G. Jiang, Q. Wang, Tensorial multiview low-rank high-order graph learning for context-enhanced domain adaptation, Neural Networks, 181 (2024), 106859. https://doi.org/10.1016/j.neunet.2024.106859 |
[35] | I. P. Singh, E. Ghorbel, A. Kacem, A. Rathinam, D. Aouada, Discriminator-free unsupervised domain adaptation for multi-label image classification, in 2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), (2024), 3924–3933. https://doi.org/10.1109/wacv57701.2024.00389 |
[36] | Z. Chen, J. Wei, R. Li, Unsupervised multi-modal medical image registration via discriminator-free image-to-image translation, in Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, (2022), 834–840. https://doi.org/10.24963/ijcai.2022/117 |
[37] | J. Wang, Y. Chen, W. Feng, H. Yu, M. Huang, Q. Yang, Transfer learning with dynamic distribution adaptation, ACM Trans. Intell. Syst. Technol., 11 (2020). https://doi.org/10.1145/3360309 |
[38] | Y. Li, L. Yuan, Y. Chen, P. Wang, N. Vasconcelos, Dynamic transfer for multi-source domain adaptation, in 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), (2021), 10993–11002. https://doi.org/10.1109/cvpr46437.2021.01085 |
[39] | K. Saenko, B. Kulis, M. Fritz, T. Darrell, Adapting visual category models to new domains, in Computer Vision–ECCV 2010: 11th European Conference on Computer Vision, 6314 (2010), 213–226. https://doi.org/10.1007/978-3-642-15561-1_16 |
[40] | X. Peng, B. Usman, N. Kaushik, J. Hoffman, D. Wang, K. Saenko, Visda: The visual domain adaptation challenge, preprint, arXiv: 1710.06924. |
[41] | K. He, X. Zhang, S. Ren, J. Sun, Deep residual learning for image recognition, in 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), (2016), 770–778. https://doi.org/10.1109/CVPR.2016.90 |
[42] | J. Shen, Y. Qu, W. Zhang, Y. Yu, Wasserstein distance guided representation learning for domain adaptation, in Proceedings of the AAAI Conference on Artificial Intelligence, 32 (2018). https://doi.org/10.1609/aaai.v32i1.11784 |
[43] | R. Xu, G. Li, J. Yang, L. Lin, Larger norm more transferable: An adaptive feature norm approach for unsupervised domain adaptation, in 2019 IEEE/CVF International Conference on Computer Vision, (2019), 1426–1435. https://doi.org/10.1109/ICCV.2019.00151 |
[44] | M. Li, Y. M. Zhai, Y. W. Luo, P. F. Ge, C. X. Ren, Enhanced transport distance for unsupervised domain adaptation, in 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), (2020), 13936–13944. https://doi.org/10.1109/CVPR42600.2020.01395 |
[45] | J. Li, Z. Li, S. Lü, Feature concatenation for adversarial domain adaptation, Expert Syst. Appl., 169 (2021), 114490. https://doi.org/10.1016/j.eswa.2020.114490 |
[46] | Y. Li, Y. Liu, D. Zheng, Y. Huang, Y. Tang, Discriminable feature enhancement for unsupervised domain adaptation, Image Vision Comput., 137 (2023), 104755. https://doi.org/10.1016/j.imavis.2023.104755 |
[47] | S. Yao, Q. Kang, M. Zhou, M. J. Rawa, A. Albeshri, Discriminative manifold distribution alignment for domain adaptation, IEEE Trans. Syst. Man Cybern.: Syst., 53 (2022), 1183–1197. https://doi.org/10.1109/TSMC.2022.3195239 |
[48] | Q. Tian, H. Yang, Z. Lu, M. Liu, Unsupervised domain adaptation through adversarial enhancement and gradient discrepancy minimization, Comput. Electr. Eng., 105 (2023), 108483. https://doi.org/10.1016/j.compeleceng.2022.108483 |
[49] | Z. Deng, Y. Luo, J. Zhu, Cluster alignment with a teacher for unsupervised domain adaptation, in 2019 IEEE/CVF International Conference on Computer Vision, (2019), 9944–9953. https://doi.org/10.1109/ICCV.2019.01004 |
[50] | P. Liu, T. Xiao, C. Fan, W. Zhao, X. Tang, H. Liu, Importance-weighted conditional adversarial network for unsupervised domain adaptation, Expert Syst. Appl., 155 (2020), 113404. https://doi.org/10.1016/j.eswa.2020.113404 |
[51] | X. Wu, S. Zhang, Q. Zhou, Z. Yang, C. Zhao, L. J. Latecki, Entropy minimization versus diversity maximization for domain adaptation, IEEE Trans. Neural Networks Learn. Syst., 34 (2021), 2896–2907. https://doi.org/10.1109/TNNLS.2021.3110109 |
[52] | E. Tzeng, J. Hoffman, K. Saenko, T. Darrell, Adversarial discriminative domain adaptation, in 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), (2017), 7167–7176. https://doi.org/10.1109/cvpr.2017.316 |
[53] | Z. Pei, Z. Cao, M. Long, J. Wang, Multi-adversarial domain adaptation, in Proceedings of the AAAI Conference on Artificial Intelligence, 32 (2018). https://doi.org/10.1609/aaai.v32i1.11767 |
[54] | Q. Tian, J. Zhou, Y. Chu, Joint bi-adversarial learning for unsupervised domain adaptation, Knowledge-Based Syst., 248 (2022), 108903. https://doi.org/10.1016/j.knosys.2022.108903 |
[55] | J. Gu, X. Qian, Q. Zhang, H. Zhang, F. Wu, Unsupervised domain adaptation for COVID-19 classification based on balanced slice Wasserstein distance, Comput. Biol. Med., 164 (2023), 107207. https://doi.org/10.1016/j.compbiomed.2023.107207 |
[56] | Z. Yue, Q. Sun, X. S. Hua, H. Zhang, Transporting causal mechanisms for unsupervised domain adaptation, in 2021 IEEE/CVF International Conference on Computer Vision (ICCV), (2021), 8599–8608. https://doi.org/10.1109/ICCV48922.2021.00848 |
Classification accuracy (%) on Office-Home (A: Art, C: Clipart, P: Product, R: Real-World).

| Methods | A→C | A→P | A→R | C→A | C→P | C→R | P→A | P→C | P→R | R→A | R→C | R→P | Avg |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ResNet-50[41] | 34.9 | 50.0 | 58.0 | 37.4 | 41.9 | 46.2 | 38.5 | 31.2 | 60.4 | 53.9 | 41.2 | 59.9 | 46.1 |
| WDGRL[42] | 44.1 | 63.8 | 74.0 | 47.3 | 57.1 | 61.7 | 51.8 | 39.1 | 72.1 | 64.9 | 45.9 | 76.5 | 58.2 |
| JAN[31] | 45.9 | 61.2 | 68.9 | 50.4 | 59.7 | 61.0 | 45.8 | 43.4 | 70.3 | 63.9 | 52.4 | 76.8 | 58.3 |
| DANN[2] | 45.6 | 59.3 | 70.1 | 47.0 | 58.5 | 60.9 | 46.1 | 43.7 | 68.5 | 63.2 | 51.8 | 76.8 | 57.6 |
| MCD[5] | 48.9 | 68.3 | 74.6 | 61.3 | 67.6 | 68.8 | 57.0 | 47.1 | 75.1 | 69.1 | 52.2 | 79.6 | 64.1 |
| HAFN[43] | 50.2 | 70.1 | 76.6 | 61.1 | 68.0 | 70.7 | 59.5 | 48.4 | 77.3 | 69.4 | 53.0 | 80.2 | 65.4 |
| ETD[44] | 51.3 | 71.9 | 85.7 | 57.6 | 69.2 | 73.7 | 57.8 | 51.2 | 79.3 | 70.2 | 57.5 | 82.1 | 67.3 |
| FGDA[45] | 51.8 | 72.0 | 79.2 | 62.7 | 72.0 | 73.9 | 60.0 | 49.7 | 79.2 | 69.5 | 56.8 | 82.3 | 67.4 |
| DFE-DA[46] | 56.4 | 74.9 | 78.2 | 62.8 | 72.3 | 73.2 | 62.3 | 53.8 | 80.4 | 72.4 | 60.5 | 82.8 | 69.2 |
| DMDA[47] | 55.7 | 75.9 | 78.6 | 56.7 | 76.1 | 73.9 | 57.9 | 51.2 | 79.2 | 66.8 | 57.6 | 82.6 | 67.7 |
| BuresNet[42] | 54.7 | 74.4 | 77.1 | 63.7 | 72.2 | 71.8 | 64.1 | 51.7 | 78.4 | 73.1 | 58.0 | 82.4 | 68.5 |
| AEGDM[48] | 56.0 | 70.7 | 79.8 | 64.6 | 73.1 | 71.6 | 61.2 | 53.1 | 79.1 | 71.6 | 59.7 | 84.3 | 68.8 |
| Baseline | 55.3 | 70.9 | 76.2 | 62.1 | 69.8 | 75.6 | 60.8 | 53.2 | 78.4 | 69.3 | 58.6 | 80.7 | 67.6 |
| Ours | 58.7 | 75.6 | 79.5 | 64.8 | 75.3 | 76.7 | 65.5 | 56.7 | 81.2 | 74.6 | 59.8 | 86.0 | 71.2 |
Classification accuracy (%) on ImageCLEF-DA (I: ImageNet, P: Pascal VOC, C: Caltech-256).

| Methods | I→P | P→I | I→C | C→I | C→P | P→C | Avg |
| --- | --- | --- | --- | --- | --- | --- | --- |
| ResNet-50[41] | 74.8 | 83.9 | 91.5 | 78.0 | 65.5 | 91.2 | 80.7 |
| WDGRL[42] | 76.8 | 87.0 | 91.7 | 87.2 | 75.2 | 90.3 | 84.7 |
| MCD[5] | 77.3 | 89.2 | 92.7 | 88.2 | 71.0 | 92.3 | 85.1 |
| JAN[31] | 76.8 | 88.0 | 94.7 | 89.5 | 74.2 | 91.7 | 85.8 |
| DANN[2] | 75.0 | 86.0 | 96.2 | 84.0 | 74.3 | 91.5 | 85.0 |
| CAT[49] | 76.7 | 89.0 | 94.5 | 89.8 | 74.0 | 93.7 | 86.3 |
| AEGDM[48] | 81.4 | 93.2 | 98.1 | 92.1 | 78.1 | 96.5 | 89.9 |
| IWCA[50] | 77.5 | 91.3 | 97.0 | 90.5 | 75.8 | 95.3 | 87.9 |
| MEDM-LS[51] | 78.2 | 93.3 | 97.2 | 93.0 | 78.3 | 95.5 | 89.3 |
| CGDM[9] | 78.7 | 93.3 | 97.5 | 92.7 | 79.2 | 95.7 | 89.5 |
| Baseline | 80.6 | 89.4 | 94.9 | 89.3 | 75.4 | 92.6 | 87.0 |
| Ours | 83.0 | 94.5 | 98.0 | 92.5 | 78.0 | 96.3 | 90.4 |
Classification accuracy (%) on Office-31 (A: Amazon, W: Webcam, D: DSLR).

| Methods | A→W | D→W | W→D | A→D | D→A | W→A | Avg |
| --- | --- | --- | --- | --- | --- | --- | --- |
| ResNet-50[41] | 68.4 | 96.7 | 99.3 | 68.9 | 62.5 | 60.7 | 76.1 |
| DAN[21] | 80.5 | 97.1 | 99.6 | 78.6 | 63.6 | 62.8 | 80.4 |
| WDGRL[42] | 72.6 | 97.1 | 99.2 | 79.5 | 63.7 | 59.5 | 78.6 |
| DANN[2] | 82.6 | 96.9 | 99.3 | 81.5 | 68.4 | 67.5 | 82.7 |
| ADDA[52] | 86.2 | 96.2 | 98.4 | 77.8 | 69.5 | 68.9 | 82.9 |
| MADA[53] | 90.0 | 97.4 | 99.6 | 87.8 | 70.3 | 66.4 | 85.2 |
| CAT[49] | 91.1 | 98.6 | 99.6 | 90.6 | 70.4 | 66.5 | 86.1 |
| ETD[44] | 92.1 | 100.0 | 100.0 | 88.0 | 71.0 | 67.8 | 86.2 |
| JBL[54] | 91.2 | 97.6 | 100.0 | 86.9 | 70.5 | 71.8 | 86.3 |
| DFE-DA[46] | 88.3 | 99.4 | 100.0 | 87.6 | 74.3 | 73.1 | 86.9 |
| DMDA[47] | 91.6 | 98.6 | 99.4 | 90.0 | 73.8 | 74.0 | 87.7 |
| BSWD[55] | 90.1 | 99.0 | 100.0 | 89.0 | 75.9 | 72.8 | 87.8 |
| Baseline | 86.7 | 98.2 | 99.4 | 86.6 | 70.4 | 68.3 | 84.9 |
| Ours | 93.0 | 99.8 | 100.0 | 91.0 | 75.6 | 73.7 | 88.9 |
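Each benchmark above reports per-task top-1 accuracy, and the Avg column is the unweighted mean over the transfer tasks. A minimal Python sketch of that bookkeeping, using the "Ours" row of the Office-31 table as the worked example:

```python
import numpy as np

def task_accuracy(preds: np.ndarray, labels: np.ndarray) -> float:
    """Top-1 accuracy (%) on a single transfer task."""
    return 100.0 * float((preds == labels).mean())

# The Avg column is the plain mean over all transfer tasks;
# the values below are the "Ours" row of the Office-31 table.
ours = {"A→W": 93.0, "D→W": 99.8, "W→D": 100.0,
        "A→D": 91.0, "D→A": 75.6, "W→A": 73.7}
print(f"Avg = {sum(ours.values()) / len(ours):.2f}")  # 88.85, reported as 88.9
```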
Ablation study on Office-31: accuracy (%) with different loss components; √/× mark whether a component is used.

| Lw | MMD loss | Dynamic factor | Fixed ratio | A→D | D→W | W→A |
| --- | --- | --- | --- | --- | --- | --- |
| × | √ | × | 5:5 | 86.6 | 98.2 | 68.3 |
| √ | × | × | 5:5 | 90.1 | 99.6 | 73.1 |
| × | √ | √ | – | 88.6 | 99.2 | 72.8 |
| √ | × | √ | – | 91.0 | 99.8 | 73.7 |
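The "MMD loss" column refers to a maximum mean discrepancy term between source and target features. For reference, below is a minimal multi-kernel Gaussian MMD sketch in PyTorch; this is the standard biased MMD² estimator with illustrative bandwidths, not the paper's exact implementation.

```python
import torch

def gaussian_mmd(source: torch.Tensor, target: torch.Tensor,
                 bandwidths=(1.0, 2.0, 4.0, 8.0)) -> torch.Tensor:
    """Biased multi-kernel MMD^2 between two (batch, feature) tensors.

    The bandwidth set is an illustrative assumption, not the paper's choice.
    """
    x = torch.cat([source, target], dim=0)       # (n_s + n_t, d)
    sq_dists = torch.cdist(x, x, p=2).pow(2)     # pairwise squared distances
    # Sum of Gaussian kernels over several bandwidths (multi-kernel MMD).
    k = sum(torch.exp(-sq_dists / (2.0 * b ** 2)) for b in bandwidths)
    n_s = source.size(0)
    k_ss = k[:n_s, :n_s].mean()                  # source-source term
    k_tt = k[n_s:, n_s:].mean()                  # target-target term
    k_st = k[:n_s, n_s:].mean()                  # cross-domain term
    return k_ss + k_tt - 2.0 * k_st              # MMD^2 estimate

# e.g., gaussian_mmd(torch.randn(32, 256), torch.randn(32, 256))
```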
Accuracy (%) under different fixed loss-weight ratios; the final column α:(1−α) denotes the dynamic factor.

| Weight ratio | 1:9 | 3:7 | 5:5 | 7:3 | 9:1 | α:(1−α) |
| --- | --- | --- | --- | --- | --- | --- |
| Accuracy (%) | 90.3 | 89.4 | 90.1 | 90.6 | 88.6 | 91.0 |
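The dynamic factor α:(1−α) outperforms every fixed ratio (91.0 versus at best 90.6). As a hedged illustration of the idea only (the paper defines its own update rule for α, which is not reproduced here), one simple scheme recomputes α each step from the detached magnitudes of the two losses so that neither term dominates:

```python
import torch

def dynamic_balance(loss_w: torch.Tensor, loss_mmd: torch.Tensor) -> torch.Tensor:
    """Combine two losses with a data-driven factor alpha:(1-alpha).

    Illustrative assumption only: alpha is set from the detached loss
    magnitudes so the weighting down-weights whichever term currently
    dominates; it is not the paper's exact dynamic factor.
    """
    total = loss_w.detach() + loss_mmd.detach() + 1e-8  # no gradient through alpha
    alpha = loss_mmd.detach() / total
    return alpha * loss_w + (1.0 - alpha) * loss_mmd

# Usage sketch: total_loss = dynamic_balance(wasserstein_loss, mmd_loss)
```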
Classification accuracy (%) on the ImageCLEF-DA I→P task.

| Methods | I→P |
| --- | --- |
| ResNet-50[41] | 74.8 |
| WDGRL[42] | 76.8 |
| MCD[5] | 77.3 |
| DANN[2] | 75.0 |
| TCM[56] | 75.8 |
| CGDM[9] | 82.3 |
| Baseline | 75.9 |
| Ours | 83.0 |