Review

A comprehensive review of graph convolutional networks: approaches and applications

  • Convolutional neural networks (CNNs) utilize local translation invariance in the Euclidean domain and have remarkable achievements in computer vision tasks. However, there are many data types with non-Euclidean structures, such as social networks, chemical molecules, knowledge graphs, etc., which are crucial to real-world applications. The graph convolutional neural network (GCN), as a derivative of CNNs for non-Euclidean data, was established for non-Euclidean graph data. In this paper, we mainly survey the progress of GCNs and introduce in detail several basic models based on GCNs. First, we review the challenges in building GCNs, including large-scale graph data, directed graphs and multi-scale graph tasks. Also, we briefly discuss some applications of GCNs, including computer vision, transportation networks and other fields. Furthermore, we point out some open issues and highlight some future research trends for GCNs.

    Citation: Xinzheng Xu, Xiaoyang Zhao, Meng Wei, Zhongnian Li. A comprehensive review of graph convolutional networks: approaches and applications[J]. Electronic Research Archive, 2023, 31(7): 4185-4215. doi: 10.3934/era.2023213




    In recent years, the popularity of smartphones, wearable devices, intelligent home appliances and autonomous driving has led to an abundance of data generated by many distributed devices. These data are usually concentrated in a data center for effective use. However, a crucial issue arises: concentrated data storage can leak personal privacy [1]. Simultaneously, as the computing power of these mobile devices increases, it becomes attractive to store data locally while completing the related computing tasks. Federated learning is a distributed machine learning framework that allows multiple parties to collaboratively train a model without sharing raw data [2,3], and it has recently attracted significant attention from industry and academia. The survey in [4] summarizes and discusses the application of federated learning to big data and its future directions. Although federated learning has essential significance and advantages in protecting user privacy, it also faces many challenges.

    First of all, due to its distributed nature, federated learning is vulnerable to Byzantine attacks. Notably, it has been shown that, with just one Byzantine client, the whole federated optimization algorithm can be compromised and fail to converge [5]. When the training data is not independent and identically distributed (non-iid), defending against Byzantine attacks becomes harder still, and it is difficult to guarantee the convergence of the model [6].

    Methods for defending against Byzantine attacks in federated learning have been extensively studied, including the coordinate-wise trimmed mean [9], the coordinate-wise median [7,8], the geometric median [10,11], and the distance-based methods Krum [12], BREA [6] and Bulyan [5]. In addition to these methods based on statistical knowledge, [14] proposes a new idea based on anomaly detection to detect Byzantine clients during the learning process. [13] discusses the challenges and future directions of federated learning in real-time scenarios in terms of cybersecurity.

    The above methods can defend against Byzantine attacks to some extent, but they also have limitations. First, the methods based on statistical knowledge have high computational complexity, and their defense abilities are weakened by the non-iid data in federated learning. Second, the anomaly detection algorithm [14] rests on the premise that the detection model is trained on the test data set. This premise cannot be realized in practical applications, because it is difficult to obtain a data set that covers almost all data distributions. Therefore, it is necessary for the anomaly detection model to be pre-trained without relying on the test data set and to be updated dynamically on non-iid data.

    In this paper, we propose a new method in which each client shares some data with the server, making a trade-off between client privacy and model performance. Unlike FedAvg [2], we use the credibility score, not the sample size, as the weight for model aggregation. The credibility score of each client is obtained by integrating the verification score and the detection score. The former is calculated on the shared data.

    The main contributions of this paper are:

    ▪ We propose a new federated learning framework (BRCA) which combines credibility assessment and unified update. BRCA not only effectively defends against Byzantine attacks, but also reduces the impact of non-iid data on the aggregated global model.

    ▪ The credibility assessment, which combines anomaly detection and data verification, effectively detects Byzantine attacks on non-iid data.

    ▪ By incorporating an adaptive mechanism and transfer learning into the anomaly detection model, the anomaly detection model can dynamically improve detection performance. Moreover, its pre-training no longer relies on the test data set.

    ▪ We customize four different data distributions for each data set, and explore the influence of data distribution on defense methods against Byzantine attacks.

    FedAvg was first proposed in [2] as an aggregation algorithm for federated learning. The server updates the global model by a weighted average of the clients' model updates, and the aggregation weight of each client is determined by its data sample size. Stich [15] and Woodworth et al. [16] analyze the convergence of FedAvg on strongly-convex smooth loss functions. However, they assume that the data is iid, which is not suitable for federated learning [17,18]. Li et al. [19] give the first convergence analysis of FedAvg when the data is non-iid. [20] uses clustering to improve federated learning on non-iid data. Regrettably, naive FedAvg has very little ability to resist Byzantine attacks.

    In the iterative process of federated aggregation, honest clients send the true model updates to the server, wishing to train a global model by consolidating their private data. However, Byzantine clients attempt to perturb the optimization process [21]. Byzantine attacks may be caused by some data corruption events in the computing or communication process such as software crashes, hardware failures and transmission errors. Simultaneously, they may also be caused by malicious clients through actively transmitting error information, in order to mislead the learning process [21].

    Byzantine-robust federated learning has received increasing attention in recent years. Krum [12] is designed specifically to defend against Byzantine attacks in federated learning. Krum generates the global model from the single client model update whose distances to its neighbors are the shortest. GeoMed [10] uses the geometric median, which generalizes the median from one dimension to multiple dimensions. Unlike Krum, GeoMed uses all client updates to generate a new global model, not just one client update. Trimmed Mean [9] obtains each dimension of the global model by averaging the parameters of the clients' model updates in that dimension, after first deleting the largest and smallest parameters in that dimension. The methods of Xie et al. [22] and Mhamdi et al. [5] are variants of it. BREA [6] also considers the security of information transmission, but its defense method is still based on distance calculation. Zero [23] uses a watermark-based detection approach to detect attacks such as malware, phishing and cryptojacking. [24] surveys intrusion detection techniques in mobile cloud computing environments.
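    The two aggregation rules described above can be sketched in a few lines. This is a minimal illustration rather than the implementation used in the cited papers: model updates are plain lists of floats, `trim` (how many values to drop at each end) and `b` (the assumed number of Byzantine clients) are illustrative parameters, and the Krum score follows its usual formulation, summing the squared distances to the n - b - 2 closest updates.

```python
from typing import List, Sequence


def trimmed_mean(updates: Sequence[Sequence[float]], trim: int) -> List[float]:
    """Coordinate-wise trimmed mean: drop the `trim` largest and smallest
    values in every dimension, then average what is left."""
    n = len(updates)
    if trim < 0 or 2 * trim >= n:
        raise ValueError("trim must satisfy 0 <= 2 * trim < number of updates")
    result = []
    for d in range(len(updates[0])):
        values = sorted(u[d] for u in updates)
        kept = values[trim:n - trim]
        result.append(sum(kept) / len(kept))
    return result


def squared_distance(u: Sequence[float], v: Sequence[float]) -> float:
    return sum((a - b) ** 2 for a, b in zip(u, v))


def krum(updates: Sequence[Sequence[float]], b: int) -> List[float]:
    """Return the single update whose summed squared distance to its
    n - b - 2 nearest neighbours is smallest."""
    n = len(updates)
    m = n - b - 2  # number of neighbours that enter the score
    if m < 1:
        raise ValueError("Krum needs n - b - 2 >= 1")
    best_index, best_score = 0, float("inf")
    for i, u in enumerate(updates):
        dists = sorted(squared_distance(u, v)
                       for j, v in enumerate(updates) if j != i)
        score = sum(dists[:m])
        if score < best_score:
            best_index, best_score = i, score
    return list(updates[best_index])
```

    Both rules ignore an extreme outlier when the honest updates are close together, which is exactly the iid assumption that breaks down under label skew.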

    All of the above defense methods based on statistical knowledge and distance are not effective in defending against Byzantine attacks in non-iid settings. Abnormal [25] uses an anomaly detection model to complete the detection of Byzantine attacks.

    The concept of independent and identically distributed (iid) data is clear, but non-iid can mean many things. In this work, we only consider label distribution skew [17], in which the categories of samples vary across clients. For example, in a face recognition task, each user generally holds only their own face data; on mobile devices, some users may use emojis that do not show up on others' devices.

    We summarize the contributions and limitations of the existing works in Table 1.

    Table 1.  The summary of the contributions and limitations of the related papers.
    Reference | Contributions | Limitations
    [12], [10], [9], [5], [22] | Krum, GeoMed and Trimmed Mean complete the Byzantine defense based on statistical knowledge. Easy to deploy in applications. | The assumption is that the data of the clients is iid. High computational complexity.
    [25] | The auto-encoder anomaly detection model is first applied to detect Byzantine attacks. | The pre-training of the anomaly detection model is completed on the test dataset. The anomaly detection model is static.
    [6] | Cryptography is used to protect the security of information transmitted between clients and server. | Defense against Byzantine attacks is still based on distance to find outliers, and has limited defense capabilities.


    In this paper, we propose a method that combines credibility assessment and unified update to make federated learning robust against Byzantine attacks on non-iid data.

    We utilize a federated setting in which one server communicates with many clients. For the rest of the paper, we use the following notation: $A$ is the total client set, $|A| = n$; $S$ is the client set selected in each iteration, $|S| = k$; among them, $B$ is the Byzantine client set, $|B| = b$, and $H$ is the honest client set, $|H| = h$. $w_i^t$ is the model update sent by client $i$ to the server at round $t$, the Byzantine attack rate is $\xi = b/k$, $w^t$ is the global model at round $t$, $D^P = \{D_1^P, \ldots, D_n^P\}$ is the clients' private data, $D^s = \{D_1^s, \ldots, D_n^s\}$ is the clients' shared data, and the data-sharing rate is $\gamma = \frac{|D^s|}{|D^P| + |D^s|}$ ($|\cdot|$ denotes the sample size of a data set).

    In order to enhance the robustness of federated learning against Byzantine attacks on non-iid data, BRCA combines credibility assessment and unified update. Figure 1 depicts the architecture of BRCA.

    Figure 1.  The frame diagram of the BRCA.

    Before training, each client needs to share some private data with the server. In each iteration, the server randomly selects some clients and sends the latest global model to them. These clients use their private data to train the model locally and send the model updates to the server. After receiving the model updates, the server conducts a credibility assessment for each model update and calculates its credibility score. Momentum is an effective measure to improve the ability of federated learning to resist Byzantine attacks [26], so our aggregation rule, Eq (1), is as follows:

    $$w^{t+1} = \alpha w^t + (1 - \alpha) \sum_{i \in S} r_i^t w_i^t$$ (1)

    where $r_i^t$ is the credibility score of client $i$ at round $t$ and $\alpha$ ($0 < \alpha < 1$) is a decay factor. Last, the unified update uses the shared data to update the primary global model and obtain the new global model for this round.
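    As a rough illustration of Eq (1), the sketch below aggregates model updates with credibility scores as weights and a decay factor on the previous global model. It assumes models are flat lists of floats and that the scores have already been normalized; the function name `aggregate` is hypothetical.

```python
from typing import List, Sequence


def aggregate(global_w: Sequence[float],
              updates: Sequence[Sequence[float]],
              scores: Sequence[float],
              alpha: float) -> List[float]:
    """Eq (1): new_w = alpha * global_w + (1 - alpha) * sum_i r_i * w_i."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie in (0, 1)")
    if len(updates) != len(scores):
        raise ValueError("one credibility score per update is required")
    new_w = []
    for d, w_d in enumerate(global_w):
        # Credibility-weighted sum of the clients' values in dimension d.
        weighted = sum(r * u[d] for r, u in zip(scores, updates))
        new_w.append(alpha * w_d + (1.0 - alpha) * weighted)
    return new_w
```

    A client whose score was set to zero by the credibility assessment contributes nothing to the sum, so it cannot move the global model in this step.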

    Algorithm 1 describes BRCA, which contains Credibility Assessment in line 22 and Unified Update in line 28. The key to BRCA's defense against Byzantine attacks is the credibility assessment. On non-iid data, the data distributions of different clients differ greatly, and it is difficult to judge whether a difference is caused by a Byzantine attack or by the non-iid data. However, the model update of an honest client should have a positive effect on its own private data, and this effect is not influenced by other clients. At the same time, an anomaly detection model can effectively detect Byzantine attacks [25]. Thus, we combine these two ideas to detect Byzantine attacks. To address the shortcomings of the existing anomaly detection models, we propose an adaptive anomaly detection model. In this paper, the shared data is randomly selected by each client based on the sample category. Other sampling methods, such as clustering, could also be used. In addition, it must be pointed out that the shared data is only used on the server, not on the clients, which effectively protects the clients' privacy.

        Algorithm 1: BRCA
      Input: total clients $A$; total number of iterations $T$; learning rates $\eta_{server}$, $\eta_{client}$, $\eta_{detection}$; Byzantine attack rate $\xi$; epochs $E_{server}$, $E_{client}$; initial global model $w^0$; clients' private data $D^P = \{D_1^P, \ldots, D_n^P\}$; clients' shared data $D^s = \{D_1^s, \ldots, D_n^s\}$; initial anomaly detection model $\theta_0$; $\beta$; $\alpha$; $d$; $k$
      Output: global model $w^{T+1}$, anomaly detection model $\theta_{T+1}$
    1  $R = \emptyset$: the credibility score set.
    2  $H = \emptyset$: the honest client set.
    3  Function Add Attack($w$):


    To summarize, BRCA has five steps. First, the server pre-trains an anomaly detection model on source data and initializes a global model. Second, every client shares a small amount of private data with the server. Third, every client downloads the newest global model from the server, completes its model update on its private data and sends the model update to the server. Fourth, the server updates the global model and adapts the anomaly detection model using the model updates from the clients. Fifth, the server updates the primary global model with the unified update, which completes the new global model. Steps three to five are repeated until the global model converges.

    Our work differs from the recent state of the art. First, Krum, GeoMed and TrimmedMean are the representative methods based on geometric knowledge, but their premise is that the data of the clients is independent and identically distributed (iid). Our method instead starts from the actual application background of FL and targets non-iid data. Second, Abnormal is the first method to detect Byzantine attacks with an auto-encoder anomaly detection model. However, the anomaly detection model in that method is trained on the test dataset and is static. Our method improves on both problems: 1) we pre-train the anomaly detection model on related but different source data, without relying on the test dataset; 2) we introduce an adaptive mechanism into the anomaly detection model, which lets the detection model be updated dynamically during the federated iterations.

    Algorithm 2 (Credibility Assessment) is the key part of BRCA; it assigns a credibility score to each client model update. A Byzantine client should receive a much lower credibility score than an honest client. To guarantee the accuracy of the credibility score, Credibility Assessment integrates the adaptive anomaly detection model and data verification.

        Algorithm 2: Credibility Assessment
      Input: local model updates $Q$; clients' shared data $D^s = \{D_1^s, \ldots, D_n^s\}$; anomaly detection model $\theta_t$; $\beta$; selected clients $S$; $\eta_{detection}$; $d$; $k$
      Output: credibility score of clients $R$; honest client set $H$; anomaly detection model $\theta_{t+1}$
    1  $R = \emptyset$: credibility score set; $H = \emptyset$: the honest client set; sum = 0; sume = 0; sumf = 0
    2  $C = \{C_1^t, \ldots, C_i^t, \ldots, C_k^t\}$, client $i \in S$; $C_i^t$ is the weight of the last convolutional layer of $w_i^t$
    3  for each client $i \in S$ do


    In Algorithm 2, line 4 is the data verification, which calculates the verification score $f_i$ for the model update of client $i$. Line 5 is the get-anomaly-score() function of the adaptive anomaly detection model, which calculates the detection score $e_i$. Subsequently, the credibility of the model update is $r_i = \beta e_i + (1 - \beta) f_i$, with $R = \{r_1, \ldots, r_i, \ldots, r_k\}$, client $i \in S$. The make-adaption() function in line 24 implements the adaptation of the anomaly detection model.

    In this paper, we judge a model update whose credibility score is lower than the mean of $R$ to be a Byzantine attack, and set its credibility score to zero. Finally, we normalize the scores to get the final credibility scores.
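    The scoring rule just described (weighted combination, mean threshold, normalization) can be sketched as follows. This is an illustrative reading of the text, not Algorithm 2 itself; the function name `credibility_scores` is hypothetical, and updates scoring exactly at the mean are assumed to be kept.

```python
from typing import List, Sequence


def credibility_scores(e: Sequence[float],
                       f: Sequence[float],
                       beta: float) -> List[float]:
    """Combine detection scores e and verification scores f, zero out the
    updates that fall below the mean, and normalize the rest to sum to 1."""
    if len(e) != len(f) or not e:
        raise ValueError("e and f must be non-empty and of equal length")
    raw = [beta * ei + (1.0 - beta) * fi for ei, fi in zip(e, f)]
    mean_r = sum(raw) / len(raw)
    # Scores strictly below the mean are treated as Byzantine (assumption:
    # a score equal to the mean is kept).
    kept = [r if r >= mean_r else 0.0 for r in raw]
    total = sum(kept)
    if total == 0.0:
        raise ValueError("all credibility scores are zero")
    return [r / total for r in kept]
```

    Because the normalized scores sum to one, they can be used directly as aggregation weights in Eq (1).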

    In the training process, we cannot predict the type of attack, but we can estimate the model update of an honest client. Therefore, we can adopt a one-class classification algorithm to build the anomaly detection model from normal model updates. Such a technique learns the distribution boundary of the model updates to determine whether a new sample is abnormal. The auto-encoder is an effective one-class learning model for detecting anomalies, especially for high-dimensional data [27].

    In practical applications, we cannot get the target data to complete the pre-training of our anomaly detection model. Therefore, the initialized anomaly detection model will be pre-trained on the source data with the idea of transfer learning.

    At round $t$, the detection score $e_i^t$ of client $i$ is:

    $$e_i^t = \exp\left(-\frac{Mse(C_i^t - \theta_t(C_i^t)) - \mu(E)}{\sigma(E)}\right)$$ (2)
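    A small sketch of how Eq (2) could be evaluated once the auto-encoder has produced its reconstructions. It assumes that $\mu(E)$ and $\sigma(E)$ are the mean and (population) standard deviation of the reconstruction errors of the selected clients, and that a larger error should give a lower score; the helper names `mse` and `detection_scores` are hypothetical.

```python
import math
from statistics import fmean, pstdev
from typing import List, Sequence


def mse(x: Sequence[float], x_hat: Sequence[float]) -> float:
    """Mean squared error between a weight vector and its reconstruction."""
    return fmean([(a - b) ** 2 for a, b in zip(x, x_hat)])


def detection_scores(errors: Sequence[float]) -> List[float]:
    """Map reconstruction errors to scores exp(-(err - mean) / std)."""
    mu = fmean(errors)
    sigma = pstdev(errors)  # assumption: sigma(E) is a standard deviation
    if sigma == 0.0:
        # All clients look identical; give every client the same score.
        return [1.0] * len(errors)
    return [math.exp(-(err - mu) / sigma) for err in errors]
```

    With errors `[1.0, 1.0, 4.0]`, the third client, which the auto-encoder reconstructs poorly, receives a clearly lower score than the other two.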

    Our anomaly detection model differs from the one in Abnormal in two ways. 1) Abnormal uses the test set of the data set to train the anomaly detection model. Although the resulting detection model performs the detection task very well, in most cases the test data set is not available. Therefore, following the idea of transfer learning, we complete the pre-training of the anomaly detection model in the source domain. 2) Abnormal's anomaly detection model is not updated after training on the test set. We think this is unreasonable, because the test set is only a tiny part of the overall data, and a model trained on this small part may not be accurate enough when it is used to detect most of the remaining data. Therefore, after pre-training in the source domain, we use the data of the target domain to fine-tune the anomaly detection model during the iterative process, so that it is updated dynamically, as shown by make-adaption in Algorithm 3.

        Algorithm 3: AADM adaptive anomaly detection model
      Input: anomaly detection model $\theta_t$; weights of the last convolutional layer of the local model $C$; $\eta_{detection}$; credibility score $R$; honest client set $H$; $d$; $k$
      Output: updated anomaly detection model $\theta_{t+1}$
    1  Function get-anomaly-score($\theta_t$, $C_i^t$):


    The non-iid nature of client data increases the difficulty of Byzantine defense. However, the performance of each client's updated model on its own shared data is not affected by other clients, which can effectively solve this problem. Therefore, we use the clients' shared data $D^s = \{D_1^s, \ldots, D_i^s, \ldots, D_k^s\}$, client $i \in S$, to calculate the verification score of their updated models:

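    $$f_i^t = \left(\exp\left(-\frac{l_i^t - \mu(L)}{\sigma(L)}\right)\right)^2$$ (3)

    where $l_i^t$ is the loss of client $i$ calculated on model $w_i^t$ using the shared data $D_i^s$ at round $t$:

    $$l_i^t = \frac{1}{|D_i^s|} \sum_{j=0}^{|D_i^s|} l(D_i^{s(j)}, w_i^t)$$ (4)

    where $D_i^{s(j)}$ is the $j$th sample of $D_i^s$ and $\mu(L)$, $\sigma(L)$ are the mean and variance of the set $L = \{l_1, \ldots, l_k\}$ respectively.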
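    Eqs (3) and (4) can be read as "average the per-sample loss on a client's shared data, then standardize across clients and squash". The sketch below follows that reading under two assumptions: the per-sample losses are already available as numbers, and the spread term $\sigma(L)$ is computed as a population standard deviation. The names `mean_loss` and `verification_scores` are hypothetical.

```python
import math
from statistics import fmean, pstdev
from typing import List, Sequence


def mean_loss(per_sample_losses: Sequence[float]) -> float:
    """Eq (4): average loss of one client's update on its shared data."""
    return fmean(per_sample_losses)


def verification_scores(losses: Sequence[float]) -> List[float]:
    """Eq (3): f_i = exp(-(l_i - mean) / spread) ** 2 for every client."""
    mu = fmean(losses)
    sigma = pstdev(losses)  # assumption: spread measured as a std deviation
    if sigma == 0.0:
        return [1.0] * len(losses)
    return [math.exp(-(l - mu) / sigma) ** 2 for l in losses]
```

    For example, client losses of `[0.5, 0.5, 2.0]` give the first two clients a score above 1 and the third a score well below 1, so an update that performs poorly on its own shared data is pushed toward the Byzantine side of the threshold.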

    After getting the credibility score $r_k^t$ in Algorithm 2 from the anomaly score $e_k^t$ and the verification score $f_k^t$, we can complete the aggregation of the clients' local model updates in Eq (1) and obtain a preliminary updated global model. However, because client data is non-iid, the knowledge learned by the local model of each client is limited, and the differences between the models of two clients are also significant. Therefore, to solve the problem that the preliminary aggregation model lacks a clear and consistent goal, we introduce an additional unified update procedure with the shared data on the server; details can be seen in Algorithm 4.

      Algorithm 4: Unified update
      Input: global model $w^{t+1}$; clients' shared data $D^s = \{D_1^s, \ldots, D_n^s\}$; $E_{server}$; $\eta_{server}$; honest client set $H$
      Output: global model $w^{t+1}$.
    1 for each epoch $e = 0$ to $E_{server}$ do


    Because the data used for the unified update is composed of each client's data, it covers the distribution of the overall data more comprehensively. The goal and direction of the unified update are therefore based on the overall situation and are not biased toward any individual data distribution.
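    One plausible shape for the unified update is a few epochs of plain gradient descent on the pooled shared data. The sketch below is only an assumption about what Algorithm 4 does: it uses the shared data of the honest clients only (suggested by $H$ appearing in the inputs), takes one gradient step per sample, and receives the gradient through a caller-supplied `grad_fn`; all names are hypothetical.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Sample = Tuple[float, float]  # (feature, target) pairs in this toy setting


def unified_update(w: Sequence[float],
                   shared: Dict[int, List[Sample]],
                   honest: Sequence[int],
                   grad_fn: Callable[[List[float], Sample], List[float]],
                   epochs: int,
                   lr: float) -> List[float]:
    """Refine the aggregated model on the server with the shared data."""
    w = list(w)
    for _ in range(epochs):
        for client in honest:  # assumption: skip clients flagged as Byzantine
            for sample in shared[client]:
                g = grad_fn(w, sample)
                w = [wd - lr * gd for wd, gd in zip(w, g)]
    return w


def squared_loss_grad(w: List[float], sample: Sample) -> List[float]:
    """Gradient of (w[0] * x - y) ** 2 for a one-parameter linear model."""
    x, y = sample
    return [2.0 * (w[0] * x - y) * x]
```

    Calling `unified_update([0.0], {0: [(1.0, 2.0)], 1: [(1.0, 100.0)]}, [0], squared_loss_grad, 2, 0.25)` moves the parameter toward the honest client's target of 2.0 and ignores the data of client 1.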

    To verify the effectiveness of BRCA, we structure the clients' data into varying degrees of non-iid and explore the impact of different amounts of shared data on the global model. We also compare the performance of our anomaly detection model with that of Abnormal and explore the necessity of the unified update.

    Mnist and Cifar10 are the two most commonly used public data sets in image classification, and most of the benchmark methods in our work also use these two data sets for experiments. Using these two data sets, it is easier to compare with other existing methods.

    We do the experiments on Mnist and Cifar10, and customize four different data distributions: (a) non-iid-1: each client only has one class of data. (b) non-iid-2: each client has 2 classes of data. (c) non-iid-3: each client has 5 classes of data. (d) iid: each client has 10 classes of data.

    For Mnist, we use 100 clients and four data distributions: (a) non-iid-1: each class of data in the training dataset is divided into 10 pieces, and each client selects one piece as its private data. (b) non-iid-2: each class of data in the training dataset is divided into 20 pieces, and each client selects 2 pieces from different classes. (c) non-iid-3: each class of data in the training dataset is divided into 50 pieces, and each client selects 5 pieces from different classes. (d) iid: each class of data in the training dataset is divided into 100 pieces, and each client selects 10 pieces from different classes. As the source domain for pre-training the anomaly detection model, we randomly select 20,000 lowercase letters from the Nist dataset.
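    The four distributions above all follow one pattern: split every class into equal pieces and hand each client a fixed number of pieces from distinct classes. The sketch below illustrates that pattern with a deterministic round-robin assignment of classes to clients, which is an assumption (the text does not say how clients pick their pieces); `partition_label_skew` and its parameters are hypothetical names.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence


def partition_label_skew(labels: Sequence[Hashable],
                         n_clients: int,
                         classes_per_client: int) -> List[List[int]]:
    """Return, for every client, the sample indices it holds."""
    by_class: Dict[Hashable, List[int]] = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    classes = sorted(by_class)
    n_classes = len(classes)
    if classes_per_client > n_classes:
        raise ValueError("a client cannot hold more classes than exist")
    if (n_clients * classes_per_client) % n_classes != 0:
        raise ValueError("pieces do not divide evenly across classes")
    # e.g. 100 clients x 2 classes over 10 classes -> 20 pieces per class
    pieces_per_class = n_clients * classes_per_client // n_classes
    next_piece = {c: 0 for c in classes}
    clients: List[List[int]] = [[] for _ in range(n_clients)]
    for i in range(n_clients):
        for j in range(classes_per_client):
            # Consecutive classes (mod n_classes) are always distinct.
            c = classes[(i * classes_per_client + j) % n_classes]
            size = len(by_class[c]) // pieces_per_class
            start = next_piece[c] * size
            clients[i].extend(by_class[c][start:start + size])
            next_piece[c] += 1
    return clients
```

    With 100 clients and 10 classes, `classes_per_client` values of 1, 2, 5 and 10 reproduce the piece counts of non-iid-1, non-iid-2, non-iid-3 and iid respectively.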

    For Cifar10, there are 10 clients, and the configuration of the four data distributions is similar to that of Mnist. We select some classes of data in Cifar100 as the source domain, which are as follows: lamp (number: 40), lawn mower (41), lobster (45), man (46), forest (47), mountain (49), girl (35), Snake (78), Rose (70) and Tao (68). These classes do not exist in Cifar10.

    We use logistic regression on the Mnist dataset, with $\eta_{server} = 0.1$, $\eta_{client} = 0.1$, $\eta_{detection} = 0.02$, $E_{client} = 5$, $E_{server} = 1$, $n = 100$, $k = 30$, $\xi = 20\%$. On Cifar10, we use a model with two convolutional layers and three fully connected layers, with $\eta_{server} = 0.05$, $\eta_{client} = 0.05$, $\eta_{detection} = 0.002$, $E_{client} = 10$, $E_{server} = 10$, $n = 10$, $k = 10$, $\xi = 20\%$. The structures of the models are the same as in [10].

    Same-value attacks: a Byzantine client $i$ sends the model update $\omega_i = c\mathbf{1}$ to the server, where $\mathbf{1}$ is the all-ones vector and $c$ is a constant; we set $c = 5$. Sign-flipping attacks: each client $i$ computes its true model update $\omega_i$, and Byzantine clients then send $\omega_i = a\omega_i$ ($a < 0$) to the server; we set $a = -5$. Gaussian attacks: Byzantine clients add Gaussian noise to all dimensions of the model update, $\omega_i = \omega_i + \epsilon$, where $\epsilon$ follows the Gaussian distribution $N(0, g^2)$ with variance $g^2$; we set $g = 0.3$.
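    The three attacks are simple transformations of a model update and can be sketched directly. The default values mirror the settings quoted above; treating `g` as the standard deviation of the noise (so that the variance is `g ** 2`) is an assumption, and the function names are hypothetical.

```python
import random
from typing import List, Optional, Sequence


def same_value_attack(update: Sequence[float], c: float = 5.0) -> List[float]:
    """Replace every coordinate with the constant c (c times all-ones)."""
    return [c] * len(update)


def sign_flipping_attack(update: Sequence[float], a: float = -5.0) -> List[float]:
    """Scale the true update by a negative factor a."""
    if a >= 0:
        raise ValueError("a must be negative")
    return [a * w for w in update]


def gaussian_attack(update: Sequence[float], g: float = 0.3,
                    rng: Optional[random.Random] = None) -> List[float]:
    """Add zero-mean Gaussian noise with standard deviation g to each
    coordinate (assumption: g is the standard deviation)."""
    rng = rng or random.Random()
    return [w + rng.gauss(0.0, g) for w in update]
```

    Passing a seeded `random.Random` instance to `gaussian_attack` makes an experiment repeatable.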

    Defenses: Krum, GeoMed, Trimmed Mean, Abnormal and No Defense. No Defense does not use any defense methods.

    In the first experiment, we test the influence of the shared data rate $\gamma$ in our algorithm, using the non-iid-2 data distribution. We evaluate five different values: 1, 3, 5, 7 and 10%. Figures 2 and 3 show the accuracy and loss for Cifar10. We find that: 1) in all cases of Byzantine attacks, our algorithm is superior to the three benchmark methods; 2) sharing only 1% of the clients' data can significantly improve the performance of the global model. For all three Byzantine attacks, Krum, GeoMed, Trimmed Mean and No Defense are unable to converge. This also shows that when the model is complex, such methods are less able to resist Byzantine attacks.

    Figure 2.  The accuracy on Cifar10. Byzantine attack types from (a) to (c) are as follows: Same value, Sign flipping and Gaussian noise. Six defense methods are adopted for each type of attack, in order: No defense, Krum, GeoMed, Trimmed Mean, Abnormal and BRCA. For BRCA, there are five different shared data rates (1, 3, 5, 7 and 10%), which correspond to BRCA 1, BRCA 3, BRCA 5, BRCA 7 and BRCA 10.
    Figure 3.  The loss of Cifar10. The legends are the same as Figure 2.

    As the client data sharing ratio increases, the additional gain in global model performance becomes smaller. When the sharing ratio goes from 1 to 10%, the average growth rates over the three Byzantine attacks are 1.8→1.41→0.97→0.92%. Sharing only one percent of the clients' data already improves the performance of the global model greatly.

    Figure 4 clearly demonstrates the impact of different shared data rates on the loss value of the global model on Cifar10.

    Figure 4.  The loss of BRCA on Cifar10 with five different shared rate.

    In this part, the purposes of our experiment are: 1) to compare our anomaly detection model with that of Abnormal; 2) to explore the robustness of the anomaly detection model to non-iid data. The shared data rate $\gamma$ is 5%, and the same setting is used in Sections 4.2.3 and 4.2.4.

    In order to compare the detection performance of the anomaly detection models of BRCA and Abnormal against Byzantine attacks, we use the cross-entropy loss, calculated from the detection scores, as the evaluation metric. First, we get the detection scores $E = \{e_1, \ldots, e_i, \ldots, e_k\}$ based on the model updates $\omega_i$ and $\theta$, client $i \in S$. Then, we set $P = \mathrm{Sigmoid}(E - \mu(E))$, which represents the probability that a client is honest, so that $1 - P$ is the probability that the client is Byzantine. Lastly, we use $P$ and the true labels $Y$ ($y_i = 0$ for $i \in B$ and $y_i = 1$ for $i \in H$) to calculate the cross-entropy loss $l = -\sum_{i=1}^{k} y_i \ln(P_i)$.
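    The evaluation metric described above can be sketched as follows. The code follows the formula as stated in the text, in which only the honest clients ($y_i = 1$) contribute a term, and assumes the usual negative sign of a cross-entropy loss; the function name `detection_cross_entropy` is hypothetical.

```python
import math
from statistics import fmean
from typing import Sequence


def detection_cross_entropy(scores: Sequence[float],
                            honest: Sequence[int]) -> float:
    """Center the detection scores, squash them with a sigmoid to get the
    probability of being honest, and sum -y_i * ln(P_i) over the clients."""
    if len(scores) != len(honest):
        raise ValueError("one label per score is required")
    mu = fmean(scores)
    probs = [1.0 / (1.0 + math.exp(-(e - mu))) for e in scores]
    return -sum(y * math.log(p) for y, p in zip(honest, probs))
```

    A detector that gives honest clients scores well above the mean drives their probabilities toward 1 and the loss toward 0.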

    Figure 5(a)–(c) compare the loss of the anomaly detection models of BRCA and Abnormal. From the figures, we can see that our model has a greater loss than Abnormal in the initial stage, mainly because our anomaly detection model is pre-trained with transfer learning and the initial pre-trained model does not yet perform well in the target domain. As the adaptation progresses, the loss of our model decreases and gradually outperforms Abnormal. Although Abnormal has a low loss in the initial stage, its loss gradually increases as training progresses, and its detection ability degenerates.

    Figure 5.  The cross-entropy loss of our anomaly detection model and that of Abnormal, on Cifar10 with non-iid-2. (a)–(c) show the performance for the three Byzantine attacks.

    Figure 6(a)–(c) show the influence of different data distributions on our detection model. The detection ability of the model differs across data distributions, but it is worth pointing out that as the degree of non-iid of the data increases, the detection ability of the model also increases.

    Figure 6.  (a)–(c) show our anomaly detection model's performance on four different data distributions (iid, non-iid-1, non-iid-2, non-iid-3) against Byzantine attacks (Gaussian noise, sign flipping, same value).

    In this part, we study the impact of the unified update on the global model. Figure 7 shows the accuracy of the global model with and without unified update on Cifar10.

    Figure 7.  The accuracy of BRCA and BRCA No on Cifar10. BRCA No is BRCA with the unified update removed.

    From non-iid-1 to iid, the improvement of the global model's accuracy by the unified update is as follows: 35.1→13.6→4.7→2.3% (Same value), 34.8→10.5→3.0→3.1% (Gaussian noise), 24.9→9.9→2.8→3.0% (Sign flipping). Combined with Figure 7, it can be clearly seen that the fewer classes each client holds, the larger the improvement that the unified update brings to the global model.

    When the data is non-iid, the directions of the model updates between clients are different. The higher the degree of non-iid of data, the more significant the difference. The global model obtained by weighted aggregation does not fit well with the global data. Unified update on the shared data can effectively integrate the model updates of multiple clients, giving the global model a consistent direction.

    Therefore, it is necessary to implement a unified update to the primary aggregation model when data is non-iid.

    Tables 2 and 3 show the accuracy and loss of each defense method under different data distributions on Cifar10. It can be seen that our method performs best, and its performance is relatively stable across data distributions. The higher the degree of non-iid of the data, and hence the less diverse the data of each client, the lower the performance of the defense methods.

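    Table 2.  The accuracy of the six defenses under four different data distributions on Cifar10, against three attacks.

    | Attack | Distribution | No | Krum | GeoMed | Abnormal | TrimmedMean | BRCA |
    |---|---|---|---|---|---|---|---|
    | Same value | Non-iid-1 | 0.1 | 0.1 | 0.1 | 0.178 | 0.1 | 0.529 |
    | Same value | Non-iid-2 | 0.101 | 0.207 | 0.205 | 0.480 | 0.1 | 0.619 |
    | Same value | Non-iid-3 | 0.1 | 0.398 | 0.398 | 0.634 | 0.1 | 0.691 |
    | Same value | iid | 0.098 | 0.696 | 0.705 | 0.698 | 0.101 | 0.713 |
    | Gaussian noisy | Non-iid-1 | 0.1 | 0.1 | 0.1 | 0.178 | 0.1 | 0.529 |
    | Gaussian noisy | Non-iid-2 | 0.191 | 0.204 | 0.205 | 0.513 | 0.059 | 0.623 |
    | Gaussian noisy | Non-iid-3 | 0.0409 | 0.398 | 0.394 | 0.660 | 0.171 | 0.692 |
    | Gaussian noisy | iid | 0.1 | 0.697 | 0.694 | 0.710 | 0.120 | 0.715 |
    | Sign flipping | Non-iid-1 | 0.1 | 0.101 | 0.1 | 0.177 | 0.1 | 0.426 |
    | Sign flipping | Non-iid-2 | 0.1 | 0.192 | 0.214 | 0.5131 | 0.1 | 0.621 |
    | Sign flipping | Non-iid-3 | 0.1 | 0.397 | 0.398 | 0.651 | 0.1 | 0.686 |
    | Sign flipping | iid | 0.1 | 0.697 | 0.703 | 0.711 | 0.1 | 0.718 |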

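    Table 3.  The loss of the six defenses under four different data distributions on Cifar10, against three attacks.

    | Attack | Distribution | No | Krum | GeoMed | Abnormal | TrimmedMean | BRCA |
    |---|---|---|---|---|---|---|---|
    | Same value | Non-iid-1 | 2.84e16 | 11.72 | 9.61 | 2.29 | 6.05e17 | 2.09 |
    | Same value | Non-iid-2 | 6.99e16 | 7.29 | 8.01 | 2.06 | 3.63e16 | 2.09 |
    | Same value | Non-iid-3 | 4.48e16 | 2.35 | 2.38 | 1.893 | 3.37e16 | 0.691 |
    | Same value | iid | 1.51e16 | 0.794 | 0.774 | 1.837 | 3.17e16 | 1.79 |
    | Gaussian noisy | Non-iid-1 | 8.635e4 | 8.41 | 9.37 | 2.29 | 936.17 | 1.54 |
    | Gaussian noisy | Non-iid-2 | 9.51 | 7.57 | 8.37 | 1.34 | 7.98 | 0.623 |
    | Gaussian noisy | Non-iid-3 | 8.22 | 2.01 | 2.31 | 0.94 | 6.07 | 0.692 |
    | Gaussian noisy | iid | 8.09 | 0.81 | 0.79 | 0.82 | 3.12 | 0.76 |
    | Sign flipping | Non-iid-1 | 2.30 | 10.72 | 9.91 | 2.29 | 2.30 | 1.54 |
    | Sign flipping | Non-iid-2 | 2.31 | 7.77 | 7.10 | 1.34 | 2.30 | 0.621 |
    | Sign flipping | Non-iid-3 | 2.31 | 2.36 | 2.13 | 0.94 | 2.30 | 0.686 |
    | Sign flipping | iid | 2.31 | 0.79 | 0.80 | 0.81 | 2.31 | 0.76 |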


    Our analysis is as follows:
    1) The non-iid data among clients causes large differences between the clients' models, so it is difficult for a defense method to judge whether an anomaly is caused by the non-iid data or by a Byzantine attack, which makes defending against Byzantine attacks harder.
    2) Krum and GeoMed use statistical knowledge to select an individual client's model or the median to represent the global model. This type of method can effectively defend against Byzantine attacks when the data is iid. However, when the data is non-iid, each client's model focuses only on a small region of the data domain and is highly independent of the others; it cannot cover the domains of the other clients and therefore cannot represent the global model.
    3) Trimmed Mean defends against Byzantine attacks by averaging. It performs well when the parameter dimension of the model is low, but as the complexity of the model increases, it cannot converge stably.
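
    To make the comparison concrete, here is a minimal sketch of two of the baselines discussed above, following their commonly cited definitions rather than the exact configurations used in these experiments: Krum selects the single update with the smallest sum of squared distances to its n - f - 2 nearest neighbours, and the coordinate-wise trimmed mean drops the largest and smallest values in each coordinate before averaging. The function names, the toy updates, and the parameters `num_byzantine` and `trim` are assumptions made for this example.

```python
from typing import List, Sequence

Vector = List[float]


def squared_distance(a: Vector, b: Vector) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def krum(updates: Sequence[Vector], num_byzantine: int) -> Vector:
    """Return the update closest (in summed squared distance) to its nearest neighbours."""
    n = len(updates)
    k = n - num_byzantine - 2  # number of neighbours scored per update
    if k < 1:
        raise ValueError("Krum needs n > num_byzantine + 2")
    best_index, best_score = 0, float("inf")
    for i, u in enumerate(updates):
        dists = sorted(squared_distance(u, v) for j, v in enumerate(updates) if j != i)
        score = sum(dists[:k])
        if score < best_score:
            best_index, best_score = i, score
    return list(updates[best_index])


def trimmed_mean(updates: Sequence[Vector], trim: int) -> Vector:
    """Per coordinate, drop the `trim` smallest and largest values, then average."""
    n = len(updates)
    if 2 * trim >= n:
        raise ValueError("trim is too large for the number of updates")
    result = []
    for coord in zip(*updates):
        kept = sorted(coord)[trim:n - trim]
        result.append(sum(kept) / len(kept))
    return result


# Four hypothetical honest updates and one obviously malicious update.
updates = [[1.0, 1.0], [1.2, 1.0], [0.8, 1.0], [1.0, 1.3], [50.0, -50.0]]
selected = krum(updates, num_byzantine=1)
averaged = trimmed_mean(updates, trim=1)
print(selected, averaged)
```

    Both rules work here because the honest updates cluster tightly. When client data is non-iid, honest updates spread out, so a single selected update (Krum) represents fewer clients and outlier trimming may discard honest values, which is consistent with the analysis above.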

    In this work, we propose BRCA, a robust federated learning framework against Byzantine attacks when the data is non-iid. BRCA detects Byzantine attacks by credibility assessment. It also performs a unified update of the global model on the shared data, so that the global model has a consistent direction and its performance improves. BRCA makes the global model converge well under different data distributions. In addition, for pre-training the anomaly detection model, transfer learning removes its dependence on the test data set. Experiments show that BRCA performs well on both non-iid and iid data, especially on non-iid data. In the future, we will improve our method by studying how to protect the privacy and security of the shared data.

    This work was partially supported by the Shanghai Science and Technology Innovation Action Plan under Grant 19511101300.

    All authors declare no conflicts of interest in this paper.



    [1] Z. Zhang, P. Cui, W. Zhu, Deep Learning on Graphs: A Survey, IEEE Trans. Knowl. Data Eng., 34 (2022), 249–270. https://doi.org/10.1109/TKDE.2020.2981333 doi: 10.1109/TKDE.2020.2981333
    [2] D. I. Shuman, S. K. Narang, P. Frossard, A. Ortega, P. Vandergheynst, The emerging field of signal processing on graphs: Extending high dimensional data analysis to networks and other irregular domains, IEEE Signal Process. Mag., 30 (2013), 83–98. https://doi.org/10.1109/MSP.2012.2235192 doi: 10.1109/MSP.2012.2235192
    [3] A. Sandryhaila, J. M. F. Moura, Big data analysis with signal processing on graphs: Representation and processing of massive data sets with irregular structure, IEEE Signal Process. Mag., 31 (2014), 80–90. https://doi.org/10.1109/MSP.2014.2329213 doi: 10.1109/MSP.2014.2329213
    [4] A. Sandryhaila, J. M. F. Moura, Discrete signal processing on graphs, IEEE Trans. Signal Process., 61 (2013), 1644–1656. https://doi.org/10.1109/TSP.2013.2238935 doi: 10.1109/TSP.2013.2238935
    [5] J. Bruna, W. Zaremba, A. Szlam, Y. Lecun, Spectral networks and locally connected networks on graphs, arXiv preprint, (2013), arXiv: 1312.6203. https://doi.org/10.48550/arXiv.1312.6203
    [6] D. Duvenaud, D. Maclaurin, J. Iparraguirre, R. Bombarell, T. Hirzel, A. Aspuru-Guzik, eet al., Convolutional networks on graphs for learning molecular fingerprints, Adv. Neural Inf. Process. Syst., 28 (2015), 2224–2232.
    [7] T. N. Kipf, M. Welling, Semi-supervised classification with graph convolutional networks, arXiv preprint, (2016), arXiv: 1609.02907.
    [8] J. Atwood, D. Towsley, Diffusion-convolutional neural networks, Adv. Neural Inf. Process. Syst., 29 (2016), 1993–2001.
    [9] M. Defferrard, X. Bresson, P. Vandergheynst, Convolutional neural networks on graphs with fast localized spectral filtering, Adv. Neural Inf. Process. Syst., 29 (2016), 3837–3845.
    [10] R. Levie, F. Monti, X. Bresson, M. M. Bronstein, CayleyNets: Graph convolutional neural networks with complex rational spectral filters, IEEE Trans. Signal Process., 67 (2019), 97–109. https://doi.org/10.1109/TSP.2018.2879624 doi: 10.1109/TSP.2018.2879624
    [11] R. Levie, W. Huang, L. Bucci, M. Bronstein, G. Kutyniok, Transferability of Spectral Graph Convolutional Neural Networks, J. Mach. Learn. Res., 22 (2021), 12462–112520.
    [12] F. Monti, D. Boscaini, J. Masci, E. Rodola, J. Svoboda, M. M. Bronstein, Geometric deep learning on graphs and manifolds using mixture model CNNs, . IEEE Conf. Comput. Vis. Pattern Recognit., Honolulu, HI, USA, 2017, 5425–5434. https://doi.org/10.1109/CVPR.2017.576
    [13] M. Fey, J. E. Lenssen, F. Weichert, H. Müller, SplineCNN: fast geometric deep learning with continuous b-spline kernels, Proc. IEEE Conf. Comput. Vis. Pattern Recognit., (2018), 869–877. https://doi.org/10.1109/CVPR.2018.00097 doi: 10.1109/CVPR.2018.00097
    [14] W. L. Hamilton, Z. Ying, J. Leskovec, Inductive representation learning on large graphs, Adv. Neural Inf. Process. Syst., 30 (2017), 1024–1034.
    [15] Y. Zhao, J. Qi, Q. Liu, R. Zhang, WGCN: Graph Convolutional Networks with Weighted Structural Features, in 2021 SIGIR, (2021), 624–633. https://doi.org/10.1145/3404835.3462834
    [16] H. Wu, C. Wang, Y. Tyshetskiy, A. Docherty, K. Lu, L. Zhu, Adversarial examples for graph data: Deep insights into attack and defense, arXiv preprint, (2019), arXiv: 1903.01610. https://doi.org/10.48550/arXiv.1903.01610
    [17] D. Zügner, S. Günnemann, Adversarial attacks on graph neural networks via meta learning, arXiv preprint, (2019), arXiv: 1902.08412. https://doi.org/10.48550/arXiv.1902.08412
    [18] K. Xu, H. Chen, S. Liu, P. Chen, T. Weng, M. Hong, et al., Topology attack and defense for graph neural networks: An optimization perspective, in Proc. Int. Joint Conf. Artif. Intell., (2019), 3961–3967. https://doi.org/10.24963/ijcai.2019/550
    [19] L. Chen, J. Li, J. Peng, A survey of adversarial learning on graph, arXiv preprint, (2003), arXiv: 2003.05730. https://doi.org/10.48550/arXiv.2003.05730
    [20] L. Chen, J. Li, J. Peng, Y. Liu, Z. Zheng, C. Yang, Understanding Structural Vulnerability in Graph Convolutional Networks, in Proc. Int. Joint Conf. Artif. Intell., (2021), 2249–2255. https://doi.org/10.24963/ijcai.2021/310
    [21] P. Velickovic, G. Cucurull, A. Casanova, A. Romero, P. Lio, Y. Bengio, Graph attention networks, arXiv preprint, (2017), arXiv: 1710.10903. https://doi.org/10.48550/arXiv.1710.10903
    [22] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit. L. Jones, A. N. Gomez, et al., Attention is all you need, Adv. Neural Inf. Process. Syst., 30 (2017), 5998–6008.
    [23] C. Zhuang, Q. Ma, Dual Graph Convolutional Networks for Graph-Based Semi-Supervised Classification, in Proc. Int. Conf. World Wide Web, (2018), 499–508. https://doi.org/10.1145/3178876.3186116
    [24] F. Hu, Y. Zhu, S. Wu, L. Wang, T. Tan, Hierarchical Graph Convolutional Networks for Semi-supervised Node Classification, in Proc. Int. Joint Conf. Artif. Intell., (2019), 4532–4539. https://doi.org/10.24963/ijcai.2019/630
    [25] Y. Zhang, S. Pal, M. Coates, D. Ü stebay, Bayesian graph convolutional neural networks for semi-supervised classification, in Proc. Int. Joint Conf. Artif. Intell., 33 (2019), 5829–5836. https://doi.org/10.1609/aaai.v33i01.33015829
    [26] Y. Luo, R. Ji, T. Guan, J. Yu, P. Liu, Y. Yang, Every node counts: Self-ensembling graph convolutional networks for semi-supervised learning, Pattern Recognit., 106 (2020), 107451. https://doi.org/10.1016/j.patcog.2020.107451 doi: 10.1016/j.patcog.2020.107451
    [27] P. Gong, L. Ai, Neighborhood Adaptive Graph Convolutional Network for Node Classification, IEEE Access, 7 (2019), 170578–170588. https://doi.org/10.1109/ACCESS.2019.2955487 doi: 10.1109/ACCESS.2019.2955487
    [28] I. Chami, Z. Ying, C. Ré, J. Leskovec, Hyperbolic graph convolutional neural networks, in Proc. Adv. Neural Inf. Process. Syst., (2019), 4868–4879.
    [29] J. Dai, Y. Wu, Z. Gao, Y. Jia, A Hyperbolic-to-Hyperbolic Graph Convolutional Network, in 2021 IEEE/CVF Conf. Computer Vision Pattern Recogn. (CVPR), (2021), 154–163. https://doi.org/10.1109/CVPR46437.2021.00022
    [30] S. Rhee, S. Seo, S. Kim, Hybrid Approach of Relation Network and Localized Graph Convolutional Filtering for Breast Cancer Subtype Classification, in 2018 Int. Joint Conf. Artif. Intell., (2018), 3527–3534. https://doi.org/10.24963/ijcai.2018/490
    [31] J. Gilmer, S. S. Schoenholz, P. F. Riley, O. Vinyals, G. E. Dahl, Neural Message Passing for Quantum Chemistry, in 2017 Int. Conf. Machine Learn., (2017), 1263–1272.
    [32] M. Zhang, Z. Cui, M. Neumann, Y. Chen, An End-to-End Deep Learning Architecture for Graph Classification, in Proc. Artif. Intell., (2018), 4438–4445. https://doi.org/10.1609/aaai.v32i1.11782
    [33] R. Ying, J. You, C. Morris, Hierarchical graph representation learning with differentiable pooling, in Proc. 32nd Int. Conf. Neural Inf. Process. Syst., (2018), 4805–4815.
    [34] Y. Ma, S. Wang, C. C Aggarwal, J. Tang, Graph convolutional networks with eigenpooling, in Proc. 25th ACM SIGKDD Int. Conf. Knowl. Discov. Data Min., (2019), 723–731. https://doi.org/10.1145/3292500.3330982
    [35] J. Lee, I. Lee, J. Kang, Self-attention graph pooling, in Proc. 36th Int. Conf. Machine Learn., (2019), 3734–3743. Available from: http://proceedings.mlr.press/v97/lee19c/lee19c.pdf
    [36] C. Cangea, P. Velickovic, N. Jovanovic, T. Kipf, P. Lio, Towards sparse hierarchical graph classifiers, in Proc. Adv. Neural Inf. Process. Syst., (2018). https://doi.org/10.48550/arXiv.1811.01287
    [37] H. Gao, S. Ji, Graph U-Nets, in Proc. 36th Int. Conf. Machine Learn., (2019), 2083–2092. https://doi.org/10.1109/TPAMI.2021.3081010
    [38] H. Gao, Z. Wang, S. Ji, Large-Scale Learnable Graph Convolutional Networks, in Proc. Knowl. Disc. Data Min., (2018), 1416–1424. https://doi.org/10.1145/3219819.3219947
    [39] W. Chiang, X. Liu, S. Si, Cluster-GCN: An Efficient Algorithm for Training Deep and Large Graph Convolutional Networks, in Proc. Knowl. Disc. Data Min., (2019), 257–266. https://doi.org/10.1145/3292500.3330925
    [40] D. Zou, Z. Hu, Y. Wang, S. Jiang, Y. Sun, Q. Gu, Layer-Dependent Importance Sampling for Training Deep and Large Graph Convolutional Networks, in Proc. Adv. Neural Inf. Process. Syst., (2019), 11249–11259.
    [41] J. Wang, Y. Wang, Z. Yang, Bi-GCN: Binary Graph Convolutional Network, in 2021 IEEE/CVF Conf. Comput. Vision Pattern Recogn. (CVPR), (2021), 1561–1570. https://doi.org/10.1109/CVPR46437.2021.00161
    [42] F. Monti, K. Otness, M. M. Bronstein, MOTIFNET: A Motif-Based Graph Convolutional Network for Directed Graphs, in Proc. IEEE Data Sci. Workshop, (2018), 225–228. https://doi.org/10.1109/DSW.2018.8439897
    [43] J. Du, S. Zhang, G. Wu, J. M. F. Moura, S. Kar, Topology adaptive graph convolutional networks, arXiv preprint, (2017), arXiv: 1710.10370.
    [44] E. Yu, Y. Wang, Y. Fu, D. B. Chen, M. Xie, Identifying critical nodes in complex networks via graph convolutional networks, Knowl.-Based Syst., 198 (2020), 105893. https://doi.org/10.1016/j.knosys.2020.105893 doi: 10.1016/j.knosys.2020.105893
    [45] C. Li, X. Qin, X. Xu, D. Yang, G. Wei, Scalable Graph Convolutional Networks with Fast Localized Spectral Filter for Directed Graphs, IEEE Access, 8 (2020), 105634–105644. https://doi.org/10.1109/ACCESS.2020.2999520 doi: 10.1109/ACCESS.2020.2999520
    [46] S. Abu-El-Haija, A. Kapoor, B. Perozzi, J. Lee, N-GCN: Multi-scale Graph Convolution for Semi-supervised Node Classification, in Proc. Conf. Uncertainty in Artif. Intell., (2019), 841–851.
    [47] S. Wan, C. Gong, P. Zhong, B. Du, L. Zhang, J. Yang, Multiscale Dynamic Graph Convolutional Network for Hyperspectral Image Classification, IEEE Trans. Geosci. Remote. Sens., 58 (2020), 3162–3177. https://doi.org/10.1109/TGRS.2019.2949180 doi: 10.1109/TGRS.2019.2949180
    [48] R. Liao, Z. Zhao, R. Urtasun, R. S. Zemel, LanczosNet: Multi-Scale Deep Graph Convolutional Networks, arXiv preprint., (2019), arXiv: 1901.01484. Available from: https://openreview.net/pdf?id = BkedznAqKQ
    [49] S. Luan, M. Zhao, X. Chang, D. Precup, Break the Ceiling: Stronger Multi-scale Deep Graph Convolutional Networks, in Proc. Conf. Workshop on Neural Inform. Process. Syst., 32 (2019), 10943–10953. Available from: https://proceedings.neurips.cc/paper_files/paper/2019/file/ccdf3864e2fa9089f9eca4fc7a48ea0a-Paper.pdf
    [50] F. Manessi, A. Rozza, M. Manzo, Dynamic Graph Convolutional Networks, Pattern Recogn., 97 (2020), 107000. https://doi.org/10.1016/j.patcog.2019.107000 doi: 10.1016/j.patcog.2019.107000
    [51] A. Pareja, G. Domeniconi, J. Chen, T. Ma, T. Suzumura, EvolveGCN: Evolving Graph Convolutional Networks for Dynamic Graphs, in Proc. Int. Joint Conf. Artif. Intell., (2020), 5363–5370. https://doi.org/10.1609/aaai.v34i04.5984
    [52] Z. Qiu, K. Qiu, J. Fu, D. Fu, DGCN: Dynamic Graph Convolutional Network for Efficient Multi-Person Pose Estimation, in Proc. Int. Joint Conf. Artif. Intell., (2020), 11924–11931. https://doi.org/10.1609/aaai.v34i07.6867
    [53] T. Song, Z. Cui, Y. Wang, W. Zheng, Q. Ji, Dynamic Probabilistic Graph Convolution for Facial Action Unit Intensity Estimation, in Proc. IEEE Conf. Comput. Vision Pattern Recogn., (2021), 4845–4854. https://doi.org/10.1109/CVPR46437.2021.00481
    [54] M. S. Schlichtkrull, T. N. Kipf, P. Bloem, R. Berg, I. Titov, M. Welling, Modeling Relational Data with Graph Convolutional Networks, In The Semantic Web: 15th Int. Conf., ESWC 2018, Heraklion, Crete, Greece, June 3–7, 2018, 593–607. https://doi.org/10.1007/978-3-319-93417-4_38
    [55] Z. Huang, X. Li, Y. Ye, M. K. Ng, MR-GCN: Multi-Relational Graph Convolutional Networks based on Generalized Tensor Product, in Proc. Int. Joint Conf. Artif. Intell., (2020), 1258–1264. https://doi.org/10.24963/ijcai.2020/175
    [56] J. Chen, L. Pan, Z. Wei, X. Wang, C. W. Ngo, T. S. Chua, Zero-Shot Ingredient Recognition by Multi-Relational Graph Convolutional Network, in Proc. Int. Joint Conf. Artif. Intell., 34 (2020), 10542–10550. https://doi.org/10.1609/aaai.v34i07.6626
    [57] P. Gopalan, S. Gerrish, M. Freedman, D. Blei, D. Mimno, Scalable inference of overlapping communities, in Proc. Conf. Workshop on Neural Inform. Process. Syst., (2012), 2249–2257.
    [58] R. Achanta, A. Shaji, K. Smith, A. Lucchi, P. Fua, S. Süsstrunk, SLIC superpixels compared to state-of-the-art superpixel methods, IEEE Trans. Pattern Anal. Mach. Intell., 34 (2012), 2274–2282. https://doi.org/10.1109/TPAMI.2012.120 doi: 10.1109/TPAMI.2012.120
    [59] W. Zheng, P. Jing, Q. Xu, Action Recognition Based on Spatial Temporal Graph Convolutional Networks, in Proc. 3rd Int. Conf. Comput. Sci. Appl. Eng., 118 (2019), 1–5. https://doi.org/10.1145/3331453.3361651
    [60] D. Tian, Z. Lu, X. Chen, L. Ma, An attentional spatial temporal graph convolutional network with co-occurrence feature learning for action recognition, Multimed. Tools Appl., 79 (2020), 12679–12697. https://doi.org/10.1007/s11042-020-08611-4 doi: 10.1007/s11042-020-08611-4
    [61] Y. Chen, G. Ma, C. Yuan, B. Li, H. Zhang, F. Wang, et al., Graph convolutional network with structure pooling and joint-wise channel attention for action recognition, Pattern Recogn., 103 (2020), 107321. https://doi.org/10.1016/j.patcog.2020.107321 doi: 10.1016/j.patcog.2020.107321
    [62] J. Dong, Y. Gao, H. J. Lee, H. Zhou, Y. Yao, Z. Fang, et al., Action Recognition Based on the Fusion of Graph Convolutional Networks with High Order Features, Appl. Sci., 10 (2020), 1482. https://doi.org/10.3390/app10041482 doi: 10.3390/app10041482
    [63] Z. Chen, S. Li, B. Yang, Q. Li, H. Liu, Multi-Scale Spatial Temporal Graph Convolutional Network for Skeleton-Based Action Recognition, in Proc. Int. Joint Conf. Artif. Intell., 35 (2021), 1113–1122. https://doi.org/10.1609/aaai.v35i2.16197
    [64] Y. Bin, Z. Chen, X. Wei, X. Chen, C. Gao, N. Sang, Structure-aware human pose estimation with graph convolutional networks, Pattern Recogn., 106 (2020), 107410. https://doi.org/10.1016/j.patcog.2020.107410 doi: 10.1016/j.patcog.2020.107410
    [65] R. Wang, C. Huang, X. Wang, Global Relation Reasoning Graph Convolutional Networks for Human Pose Estimation, IEEE Access, 8 (2020), 38472–38480. https://doi.org/10.1109/ACCESS.2020.2973039 doi: 10.1109/ACCESS.2020.2973039
    [66] T. Sofianos, A. Sampieri, L. Franco, F. Galasso, Space-Time-Separable Graph Convolutional Network for Pose Forecasting, in Proc. IEEE/ICCV Int. Conf. Comput. Vision, (2021), 11209–11218. https://doi.org/10.48550/arXiv.2110.04573
    [67] Z. Zou, W. Tang, Modulated Graph Convolutional Network for 3D Human Pose Estimation, in Proc. ICCV, (2021), 11457–11467. https://doi.org/10.1109/ICCV48922.2021.01128
    [68] B. Yu, H. Yin, Z. Zhu, Spatio-Temporal Graph Convolutional Networks: A Deep Learning Framework for Traffic Forecasting, in Proc. Int. Joint Conf. Artif. Intell., (2018), 3634–3640. https://doi.org/10.24963/ijcai.2018/505
    [69] Y. Han, S. Wang, Y. Ren, C. Wang, P. Gao, G. Chen, Predicting Station-Level Short-Term Passenger Flow in a Citywide Metro Network Using Spatiotemporal Graph Convolutional Neural Networks, ISPRS Int. J. Geo-Inform., 8 (2019), 243. https://doi.org/10.3390/ijgi8060243 doi: 10.3390/ijgi8060243
    [70] B. Zhao, X. Gao, J. Liu, J. Zhao, C. Xu, Spatiotemporal Data Fusion in Graph Convolutional Networks for Traffic Prediction, IEEE Access, 8 (2020), 76632–76641. https://doi.org/10.1109/ACCESS.2020.2989443 doi: 10.1109/ACCESS.2020.2989443
    [71] L. Ge, H. Li, J. Liu, A. Zhou, Temporal Graph Convolutional Networks for Traffic Speed Prediction Considering External Factors, in Proc. Int. Conf. Mobile Data Manag., (2019), 234–242. https://doi.org/10.1109/MDM.2019.00-52
    [72] L. Ge, S. Li, Y. Wang, F. Chang, K. Wu, Global Spatial-Temporal Graph Convolutional Network for Urban Traffic Speed Prediction, Appl. Sci.-basel, 10 (2020), 1509. https://doi.org/10.3390/app10041509 doi: 10.3390/app10041509
    [73] P. Han, P. Yang, P. Zhao, S. Shang, Y. Liu, J. Zhou, et al., GCN-MF: Disease-Gene Association Identification by Graph Convolutional Networks and Matrix Factorization, Knowl. Disc. Data Min., (2019), 705–713. https://doi.org/10.1145/3292500.3330912 doi: 10.1145/3292500.3330912
    [74] J. Li, Z. Li, R. Nie, Z. You, W. Bao, FCGCNMDA: predicting miRNA-disease associations by applying fully connected graph convolutional networks, Mol. Genet. Genom., 295 (2020), 1197–1209. https://doi.org/10.1007/s00438-020-01693-7 doi: 10.1007/s00438-020-01693-7
    [75] L. Wang, Z. You, Y. Li, K. Zhang, Y. Huang, GCNCDA: A new method for predicting circRNA-disease associations based on Graph Convolutional Network Algorithm, PLoS Comput. Biol., 16 (2020), e1007568. https://doi.org/10.1371/journal.pcbi.1007568 doi: 10.1371/journal.pcbi.1007568
    [76] C. Wang, J. Guo, N. Zhao, Y. Liu, X. Liu, G. Liu, et al., A Cancer Survival Prediction Method Based on Graph Convolutional Network, IEEE Trans. NanoBiosci., 19 (2019), 117–126. https://doi.org/10.1109/TNB.2019.2936398 doi: 10.1109/TNB.2019.2936398
    [77] H. Chen, F. Zhuang, L. Xiao, L. Ma, H. Liu, R. Zhang, et al., AMA-GCN: Adaptive Multi-layer Aggregation Graph Convolutional Network for Disease Prediction, in Proc. IJCAI, (2021), 2235–2241. https://doi.org/10.24963/ijcai.2021/308
    [78] K. Gopinath, C. Desrosiers, H. Lombaert, Learnable Pooling in Graph Convolutional Networks for Brain Surface Analysis, IEEE Trans. Pattern Anal. Mach. Intell., 44 (2022), 864–876. https://doi.org/10.1109/TPAMI.2020.3028391 doi: 10.1109/TPAMI.2020.3028391
    [79] R. Ying, R. He, K. Chen, Graph Convolutional Neural Networks for Web-Scale Recommender Systems, in Proc. Knowl. Disc. Data Min., (2018), 974–983. https://doi.org/10.1145/3219819.3219890
    [80] X. Xia, H. Yin, J. Yu, Q. Wang, L. Cui, X. Zhang, Self-Supervised Hypergraph Convolutional Networks for Session-based Recommendation, in Proc. Int. Joint Conf. Artif. Intell., 35 (2021), 4503–4511. https://doi.org/10.1609/aaai.v35i5.16578
    [81] H. Chen, L. Wang, Y. Lin, C. Yeh, F. Wang, H. Yang, Structured Graph Convolutional Networks with Stochastic Masks for Recommender Systems, in Proc. SIGIR, (2021), 614–623. https://doi.org/10.1145/3404835.3462868
    [82] L. Chen, Y. Xie, Z. Zheng, H. Zheng, J. Xie, Friend Recommendation Based on Multi-Social Graph Convolutional Network, IEEE Access, 8 (2020), 43618–43629. https://doi.org/10.1109/ACCESS.2020.2977407 doi: 10.1109/ACCESS.2020.2977407
    [83] T. Zhong, S. Zhang, F. Zhou, K. Zhang, G. Trajcevski, J. Wu, Hybrid graph convolutional networks with multi-head attention for location recommendation, World Wide Web, 23 (2020), 3125–33151. https://doi.org/10.1007/s11280-020-00824-9 doi: 10.1007/s11280-020-00824-9
    [84] T. H. Nguyen, R. Grishman, Graph Convolutional Networks with Argument-Aware Pooling for Event Detection, in Proc. AAAI Confer. Artif. Intell., 32 (2018). https://doi.org/10.1609/aaai.v32i1.12039
    [85] Z. Guo, Y. Zhang, W. Lu, Attention Guided Graph Convolutional Networks for Relation Extraction, Ann. Meet. Assoc. Comput. Linguist., (2019), 241–251. https://doi.org/10.18653/v1/P19-1024 doi: 10.18653/v1/P19-1024
    [86] Y. Hong, Y. Liu, S. Yang, K. Zhang, A. Wen, J. Hu, Improving Graph Convolutional Networks Based on Relation-Aware Attention for End-to-End Relation Extraction, IEEE Access, 8 (2020), 51315–51323. https://doi.org/10.1109/ACCESS.2020.2980859 doi: 10.1109/ACCESS.2020.2980859
    [87] Z. Meng, S. Tian, L. Yu, Y. Lv, Joint extraction of entities and relations based on character graph convolutional network and Multi-Head Self-Attention Mechanism, J. Exp. Theor. Artif. Intell., 33 (2021), 349–362. https://doi.org/10.1080/0952813X.2020.1744198 doi: 10.1080/0952813X.2020.1744198
    [88] L. Yao, C. Mao, Y. Luo, Graph Convolutional Networks for Text Classification, Artif. Intell., (2019), 7370–7377. https://doi.org/10.1609/aaai.v33i01.33017370 doi: 10.1609/aaai.v33i01.33017370
    [89] M. Chandra, D. Ganguly, P. Mitra, B. Pal, J. Thomas, NIP-GCN: An Augmented Graph Convolutional Network with Node Interaction Patterns, in Proc. SIGIR, (2021), 2242–2246. https://doi.org/10.1145/3404835.3463082
    [90] L. Xiao, X. Hu, Y. Chen, Y. Xue, D. Gu, B. Chen, et al., Targeted Sentiment Classification Based on Attentional Encoding and Graph Convolutional Networks, Appl. Sci., 10 (2020), 957. https://doi.org/10.3390/app10030957 doi: 10.3390/app10030957
    [91] P. Zhao, L. Hou, O. Wu, Modeling sentiment dependencies with graph convolutional networks for aspect-level sentiment classification, Knowl.-Based Syst., 193 (2020), 105443. https://doi.org/10.1016/j.knosys.2019.105443 doi: 10.1016/j.knosys.2019.105443
    [92] S. Jiang, Q. Chen, X. Liu, B. Hu, L. Zhang, Multi-hop Graph Convolutional Network with High-order Chebyshev Approximation for Text Reasoning, arXiv preprint, (2021), arXiv: 2106.05221. https://doi.org/10.18653/v1/2021.acl-long.513
    [93] R. Li, H. Chen, F. Feng, Z. Ma, X. Wang, E. Hovy, Dual Graph Convolutional Networks for Aspect-based Sentiment Analysis, in Proc. 59 Ann. Meet. Assoc. Comput. Linguist. And 11th Int. joint Conf. Nat. Language process., 1 (2021), 6319–6329.
    [94] L. Lv, J. Cheng, N. Peng, M. Fan, D. Zhao, J. Zhang, Auto-encoder based Graph Convolutional Networks for Online Financial Anti-fraud, IEEE Comput. Intell. Financ. Eng. Econ., (2019), 1–6. https://doi.org/10.1109/CIFEr.2019.8759109 doi: 10.1109/CIFEr.2019.8759109
    [95] C. Li, D. Goldwasser, Encoding Social Information with Graph Convolutional Networks for Political Perspective Detection in News Media, in Proc. 57th Ann. Meet. Assoc. Comput. Linguist., (2019), 2594–2604. https://doi.org/10.18653/v1/p19-1247
    [96] Y. Sun, T. He, J. Hu, H. Hang, B. Chen, Socially-Aware Graph Convolutional Network for Human Trajectory Prediction, in 2019 IEEE 3rd Inf. Technol. Network. Electron. Autom. Control Conf. (ITNEC), (2019), 325–333. https://doi.org/10.1109/ITNEC.2019.8729387
    [97] J. Chen, J. Li, M. Ahmed, J. Pang, M. Lu, X. Sun, Next Location Prediction with a Graph Convolutional Network Based on a Seq2seq Framework, KSII Trans. Internet Inf. Syst., 14 (2020), 1909–1928. https://doi.org/10.3837/tiis.2020.05.003 doi: 10.3837/tiis.2020.05.003
    [98] X. Li, Y. Xin, C. Zhao, Y. Yang, Y. Chen, Graph Convolutional Networks for Privacy Metrics in Online Social Networks, Appl. Sci.-Basel, 10 (2020), 1327. https://doi.org/10.3390/app10041327 doi: 10.3390/app10041327
  • © 2023 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)
