
In this paper, the exponential bipartite consensus issue is investigated for multi-agent networks whose dynamics are characterized by fractional diffusion partial differential equations (PDEs). The main contribution is that a novel exponential convergence principle is proposed for networks of fractional PDEs via an aperiodically intermittent control scheme. First, under the aperiodically intermittent control strategy, an exponential convergence principle is developed for continuously differentiable functions. Second, on the basis of the proposed convergence principle and the designed intermittent boundary control protocol, the exponential bipartite consensus condition is addressed in the form of linear matrix inequalities (LMIs). Compared with existing works, the exponential intermittent consensus result presented in this paper applies to networks of PDEs. Finally, the high-speed aerospace vehicle model is applied to verify the effectiveness of the control protocol.
Citation: Xinxin Zhang, Huaiqin Wu. Bipartite consensus for multi-agent networks of fractional diffusion PDEs via aperiodically intermittent boundary control[J]. Mathematical Biosciences and Engineering, 2023, 20(7): 12649-12665. doi: 10.3934/mbe.2023563
Nowadays, smart terminals, such as smart bracelets, are becoming increasingly popular. These devices are easy to carry and have many powerful sensors that can detect various inputs from the wearer's body [1,2]. These data have strong research value. Using these data to train machine learning models will hopefully make these smart terminals smarter to better serve people. As people pay increasing attention to the protection of data privacy [3,4], it is impossible and inadvisable to collect these private data, which leads to the problem of data silos [5]. However, if only local data is used for training, then the following problems emerge: 1) insufficient local data leads to model convergence difficulties and poor performance, and 2) limited diversity of data types hampers the model's generalization capability. As a solution, federated learning is proposed [6,7]. In federated learning, each participant uses their own private data to train the local model that is sent to the central server for aggregation to obtain the global model. Under the framework of federated learning, participants' private data are not exported locally, which protects participants' data privacy [8,9,10].
Although federated learning offers significant advantages in terms of data privacy protection compared to centralized training, it also faces numerous challenges [11,12]. Among these challenges, heterogeneity stands out as the most crucial one [13,14]. System heterogeneity arises when participants have varying storage capacities and computing power [15,16]. Statistical heterogeneity, on the other hand, emerges when participants exhibit different data type distributions and data volumes, leading to the non-IID (non-Independent and Identically Distributed) problem [17]. While hardware advancements have gradually addressed system heterogeneity, statistical heterogeneity remains a persistent challenge. Non-IID data are prevalent in real-life scenarios, and federated learning's performance experiences a notable decline when confronted with such data. The classical federated learning algorithm proposed by Google [18,19,20,21], FedAvg (Federated Averaging), fails to achieve superior performance across all clients compared to models trained locally by individual clients. Some participants even experience minimal benefits or inferior performance when participating in federated learning, discouraging their willingness to engage in the process. The existing FedAvg algorithm in federated learning no longer meets practical requirements, necessitating the development of a novel algorithm to address the statistical heterogeneity challenge posed by non-IID data distributions. This research aims to tackle this challenge and provide a solution in this paper.
The negative effects of non-IID data on FedAvg can be explained in terms of the divergence of the model parameters.
We define $l(x_i, y_i; w)$ as the loss of the prediction on example $(x_i, y_i)$ made with model parameters $w$. We aim for:
$$ \min_{w \in \mathbb{R}^d} f(w) \quad \text{where} \quad f(w) \overset{\text{def}}{=} \frac{1}{n}\sum_{i=1}^{n} l(x_i, y_i; w) \tag{1.1} $$
In the centralized machine learning environment, let $w^{(c)}_t$ denote the weights after the $t$-th update. Then, the model parameters are updated in the following way:
$$ w^{(c)}_t = w^{(c)}_{t-1} - \eta \nabla_w f(w^{(c)}_{t-1}) \tag{1.2} $$
In the federated learning environment, we assume there are $K$ clients over which the data are partitioned, with $P_k$ the set of indices of the data points on client $k$ and $n_k = |P_k|$. On each client, local training is conducted separately using local data:
$$ w^{(k)}_t = w^{(k)}_{t-1} - \eta \nabla_w f(w^{(k)}_{t-1}) \tag{1.3} $$
Let $w^{(f)}_t$ denote the aggregated global weights, calculated as:
$$ w^{(f)}_t = \sum_{k=1}^{K} \frac{n_k}{\sum_{k'=1}^{K} n_{k'}}\, w^{(k)}_t \tag{1.4} $$
Finally, we define:
$$ \text{weight divergence} = \left\| w^{(f)}_t - w^{(c)}_t \right\| \tag{1.5} $$
The divergence between $w^{(k)}_t$, $w^{(f)}_t$ and $w^{(c)}_t$ can be visualized in Figures 1 and 2. When the data are IID, the data distribution of each client $k$ is almost identical to the global data distribution, so the divergence between $w^{(k)}_t$ and $w^{(c)}_t$ is small. Therefore, $w^{(f)}_t$, obtained after aggregating the different $w^{(k)}_t$ according to Eq (1.4), also has a very small divergence from $w^{(c)}_t$. After many iterations, $w^{(f)}_t$ is still close to $w^{(c)}_t$ and the weight divergence remains small. At this point, federated learning can perform very well. When the data are non-IID, the large differences in the distributions of the data owned by the clients cause the divergence between the local models $w^{(k)}_t$ of different clients to grow, and the divergence between $w^{(k)}_t$ and $w^{(c)}_t$ also becomes larger. Therefore, the divergence between $w^{(f)}_t$ and $w^{(c)}_t$ becomes much larger and accumulates very fast; after many iterations, the weight divergence grows larger and larger. From the above analysis, it is concluded that the negative impact of non-IID data on federated learning is mainly due to the differences in the data distributions of the clients. Based on this, we improve the FedAvg algorithm and propose a new algorithm, which clusters clients with similar data distributions to find locally IID data within the globally non-IID distribution in order to solve the non-IID problem.
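To make the quantities above concrete, the following sketch (illustrative NumPy code, not the paper's implementation; the loss and data are synthetic) compares centralized gradient descent (Eq (1.2)) with FedAvg-style aggregation of local updates (Eqs (1.3)–(1.4)) and reports the weight divergence of Eq (1.5).

```python
import numpy as np

def grad(w, X, y):
    """Gradient of the mean squared error 0.5*||Xw - y||^2 / n with respect to w."""
    return X.T @ (X @ w - y) / len(y)

rng = np.random.default_rng(0)
d, eta = 5, 0.1
X = rng.normal(size=(600, d))
y = X @ rng.normal(size=d) + 0.1 * rng.normal(size=600)

# Split the data across K clients (uneven shard sizes: 100, 200 and 300 samples).
parts = np.split(np.arange(len(y)), [100, 300])
n_k = np.array([len(p) for p in parts])
K = len(parts)

w_c = np.zeros(d)                      # centralized weights  w^(c)
w_f = np.zeros(d)                      # federated weights    w^(f)
for t in range(20):
    # Centralized update, Eq (1.2): one step on the full dataset.
    w_c = w_c - eta * grad(w_c, X, y)
    # Local updates, Eq (1.3): each client steps on its own shard.
    w_locals = [w_f - eta * grad(w_f, X[p], y[p]) for p in parts]
    # FedAvg aggregation, Eq (1.4): weighted by client sample counts.
    w_f = sum(nk / n_k.sum() * wk for nk, wk in zip(n_k, w_locals))

# Weight divergence, Eq (1.5).
print("||w_f - w_c|| =", np.linalg.norm(w_f - w_c))
```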
Contributions of this paper:
1) In this paper, the FedSC algorithm is proposed to improve the accuracy of federated learning in the case of data imbalance, which helps to solve the problem of data heterogeneity in federated learning.
2) This paper reproduces a variety of federated learning algorithms, such as FedNova [22], SCAFFOLD [23], FedProx [17], etc. Through the comparisons, it demonstrates that FedSC consistently delivers strong performance across different scenarios.
3) The data training of the FedSC algorithm is performed only locally and no local data transmission is involved, which ensures data privacy and security for all participants and helps to solve the problem of "data silos".
The structure of this paper is as follows. Section 2 presents the related work. Section 3 describes the FedSC algorithm flow in detail as well as the implementation details of each part. Section 4 presents the experimental results. Section 5 gives a summary of the full text.
Non-IID data distribution is common in real life [24]; for example, different regions may have completely different vegetation distributions [25]. Due to data regulation and privacy concerns, meaningful real federated datasets are difficult to obtain [26]. In this paper, we use the Dirichlet distribution to simulate the non-IID distribution of data, where each client is allocated a proportion of the samples of each label according to a Dirichlet distribution. The Dirichlet distribution is an appropriate choice to simulate non-IID data distributions and has been used in many studies [22,27,28]. Specifically, we sample $p_k \sim \mathrm{Dir}_N(\beta)$ and allocate a $p_{k,j}$ proportion of the instances of class $k$ to party $j$. Here, $\mathrm{Dir}_N(\beta)$ denotes the Dirichlet distribution and $\beta$ is a concentration parameter ($\beta > 0$). An advantage of this approach is that we can flexibly change the imbalance level by varying the concentration parameter $\beta$: the smaller $\beta$ is, the more unbalanced the partition.
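A minimal sketch of such a Dirichlet-based partition (an illustrative helper under the assumptions above, not the exact partitioning code used in the experiments):

```python
import numpy as np

def dirichlet_partition(labels, num_clients, beta, seed=0):
    """Split sample indices across clients with per-class proportions drawn
    from Dir(beta), as used to simulate non-IID federated data."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    client_indices = [[] for _ in range(num_clients)]
    for k in np.unique(labels):
        idx_k = rng.permutation(np.where(labels == k)[0])
        # p_k ~ Dir_N(beta): proportion of class k assigned to each client.
        p_k = rng.dirichlet(np.repeat(beta, num_clients))
        # Cut points that split the class-k samples according to p_k.
        cuts = (np.cumsum(p_k)[:-1] * len(idx_k)).astype(int)
        for j, shard in enumerate(np.split(idx_k, cuts)):
            client_indices[j].extend(shard.tolist())
    return client_indices

# Example: 100 clients, beta = 0.5 (the default setting in this paper).
labels = np.random.randint(0, 10, size=60_000)
clients = dirichlet_partition(labels, num_clients=100, beta=0.5)
print([len(c) for c in clients[:5]])
```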
The biggest challenge of federated learning is that its performance deteriorates when data are distributed in a non-IID fashion [29,30]. Many methods have been put forward to solve this problem [31,32,33,34].

The FedProx algorithm is proposed in [17], which improves the local objective function of the FedAvg algorithm. FedProx adds a proximal regularization term to the local objective function to limit the distance between the local model and the global model, so that the local models do not drift too far apart. This helps to avoid local inconsistencies and improves the generalization of the model. In addition, FedProx can balance the distance between the global and local models by adjusting the regularization hyperparameter to better accommodate different data distributions. However, its shortcomings are also obvious: the client needs to carefully tune the weight of the proximal term in the local objective function. If the weight is too small, the adjustment has little effect; if it is too large, the local updates become very small and the local model converges slowly.

[22] observes that heterogeneity in the clients' local datasets and computation speeds results in large variations in the number of local updates performed by each client in each communication round, and that naively aggregating such models biases the global model updates. To tackle this challenge, FedNova improves FedAvg in the aggregation stage by normalizing and scaling the local updates of each party according to their number of local steps before updating the global model. Each client can therefore iterate autonomously and make better use of the information in its local data, improving the generalization capability of the model. The FedNova algorithm achieves better performance on some non-IID datasets.

Scaffold [23] introduces two control variates (a server variate and a client variate), which estimate the update direction of the server model and the update direction of each client. The difference between these two variates is used to approximate the bias in local training, and the local updates are corrected by adding this difference during local training. The Scaffold algorithm makes the federated learning model more stable, avoiding the instability caused by non-IID datasets. However, due to the additional control variates, Scaffold doubles the communication of each round compared to FedAvg.

In [35], a federated learning method named FedCPF is proposed for vehicle edge computing scenarios in 6G communication networks. The method improves the efficiency and accuracy of federated learning by improving the communication method and optimizing the local training process, while reducing the communication overhead. Some researchers also aim to improve the non-IID data distribution by sharing a certain percentage of the data, such as FedShare [36] and Hybrid-FL [37]. Although the participants can be regarded as trustworthy, direct data transfer still risks privacy leakage, which is against the original purpose of federated learning. In [38], the authors divide clients according to the cosine similarity between the weight updates of different clients to realize federated training of multiple models; however, this generates different personalized models for different clients rather than a unified model. In [39], TiFL was proposed, which divides participants into multiple tiers according to their different performance.
TiFL selects clients from the same tier in each round of training in order to alleviate the problems caused by heterogeneity in the resources and data quantities owned by participants. The concept of client clustering is also employed in this paper to address the issue of non-IID data distribution. By clustering clients according to their data distributions, the goal is to achieve a more balanced data distribution within each cluster. This approach aims to minimize the adverse effects of non-IID data on the performance of federated learning algorithms.
This section describes the FedSC algorithm in detail.
Non-IID data make local model parameters divergent in federated learning, so the central server cannot aggregate a good model; with IID data, however, federated learning gives satisfactory results. Thus, we cluster the clients according to their data distributions. Clients with highly similar data distributions are grouped into the same cluster, so the data distribution within each cluster is close to IID. Federated optimization within each cluster can then greatly improve the performance of the model and improve efficiency.
Let $n_i$ be the total number of data samples owned by client $i$ and $n_{ij}$ be the number of class-$j$ data samples owned by client $i$, and define the data attribute of client $i$ as $I_i = [n_{i1}/n_i, n_{i2}/n_i, \ldots, n_{iJ}/n_i]$, where $J$ is the number of classes. Clients are clustered based on their data attributes.
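For illustration, each client could compute its data attribute vector from its local labels as in the following sketch (a hypothetical helper, assuming integer class labels):

```python
import numpy as np

def data_attribute(labels, num_classes):
    """I_i = per-class fractions [n_i1/n_i, ..., n_iJ/n_i] of client i's samples."""
    counts = np.bincount(np.asarray(labels), minlength=num_classes)
    return counts / counts.sum()

print(data_attribute([0, 0, 1, 2, 2, 2], num_classes=10))
```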
A bottom-up clustering algorithm is proposed. Initially, each client is treated as an individual cluster, and clusters are progressively merged at each step. Ultimately, a cluster encompassing all samples is obtained. The detailed algorithm steps are as follows:
1) Treat each client as an individual cluster.
2) Calculate the distance between every two clusters and merge the pair of clusters with the smallest distance. Here, the distance between two clusters is defined as the maximum distance between any pair of clients in the two clusters (complete linkage). Specifically, the distance between cluster $p$ and cluster $q$ is defined as follows:
$$ D_{p,q} = \max\left\{\, d_{ij} = \|I_i - I_j\|_2 \;\middle|\; i \in p,\ j \in q \,\right\} \tag{3.1} $$
where $d_{ij}$ represents the distance between the data attribute $I_i$ of client $i$ in cluster $p$ and the data attribute $I_j$ of client $j$ in cluster $q$.
3) Repeat step 2 and iteratively merge clusters until the number of clusters meets the requirement (a code sketch of this procedure is given below).
In this way, the data distribution among clients within the same cluster becomes similar. Consequently, federated learning can effectively operate within each cluster, yielding improved results.
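The bottom-up procedure in steps 1)–3) corresponds to complete-linkage agglomerative clustering of the attribute vectors; a small, naive sketch (illustrative only, and quadratic in the number of clients) is given below.

```python
import numpy as np

def cluster_clients(attributes, num_clusters):
    """Agglomerative clustering of client data attributes I_i with complete
    linkage (Eq (3.1)): cluster distance = maximum pairwise client distance."""
    attributes = np.asarray(attributes)
    clusters = [[i] for i in range(len(attributes))]   # step 1: one client per cluster
    while len(clusters) > num_clusters:                # step 3: repeat until enough clusters
        best = (None, None, np.inf)
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = max(np.linalg.norm(attributes[i] - attributes[j])
                        for i in clusters[a] for j in clusters[b])
                if d < best[2]:
                    best = (a, b, d)
        a, b, _ = best                                  # step 2: merge the closest pair
        clusters[a] += clusters.pop(b)
    return clusters

# Example: attribute vectors I_i = per-class sample fractions of 20 clients.
I = np.random.dirichlet(np.ones(10), size=20)
print(cluster_clients(I, num_clusters=5))
```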
In centralized machine learning, the model updates its parameters using one batch of data at a time, as shown in Figure 3. All the data participate in the training process and each type of data contributes to the model fairly.
In the FedAvg algorithm, the model parameters are updated by aggregating the model parameters of each client. The central server sends the initial model parameters $w^{(f)}_t$ to the clients. Each client uses $w^{(f)}_t$ as the starting point, trains with its local dataset, and sends the trained model $w^{(k)}_t$ to the central server. The central server aggregates the obtained models to obtain $w^{(f)}_{t+1}$. The training process of federated learning is shown in Figure 4.
There is a wide gap between federated learning and centralized machine learning in how model parameters are updated. Because the clients are clustered, the traditional training process of the federated learning algorithm is no longer applicable, so we improve it. We let the data in each cluster contribute to the global model of federated learning by imitating the batch processing of centralized machine learning: the model aggregated by the central server from the client models in one cluster is treated as a model produced from one batch of data in centralized machine learning. The central server then transmits the aggregated model to the clients of the next cluster for training, which corresponds to continuing to train the model with another batch of data in centralized machine learning, as shown in Figure 5.
First, each client sends its data attribute $I$ to the central server. Because information about the proportions of the client's data categories may be leaked during this process, the client encrypts the data attribute $I$ before sending it to the central server. After receiving the data attributes from all clients and decrypting them, the central server uses the clustering algorithm to group all clients into $N$ clusters. Next, the central server selects a $C$-fraction of the clients from the first cluster and sends the initial global model parameters $w^{(f)}_0$ to them. After the clients train for $E$ epochs with their private data, the obtained local model parameters are uploaded to the central server. The central server performs a weighted average of the obtained model parameters according to the number of data samples of each client to obtain the global model parameters $w^{(f)}_1$. Then, the central server selects a $C$-fraction of the clients from the second cluster and sends the global model parameters to the selected clients. The selected clients take $w^{(f)}_1$ as the initial model parameters, train for $E$ epochs on their local datasets, and send the trained model parameters to the central server. After obtaining the model parameters of all clients selected from the second cluster, the central server aggregates them to obtain the global model parameters $w^{(f)}_2$. Subsequently, $w^{(f)}_2$ is sent to the clients selected from the third cluster, and so on. This process is repeated for $N$ aggregations, so that the original model parameters are trained with the data from all $N$ clusters, yielding the model parameters $w^{(f)}_N$, which contain the knowledge learned from the data of the $N$ clusters. This constitutes one communication round. Next, the central server sends the model parameters $w^{(f)}_N$ to the clients selected from the first cluster, and the process is repeated until the model converges. In centralized machine learning, all data types are traversed in units of one batch; in the proposed algorithm, all data types are traversed in units of the data of the clients selected from each cluster. In the FedSC algorithm, data training only occurs locally on the client side; the clients and the server exchange only model parameters, without sharing any data. This mitigates the risk of data leakage to a certain extent and ensures the data security of clients. The schematic diagram of the FedSC algorithm is shown in Figure 6 and the pseudocode of the FedSC algorithm is shown in Algorithm 1.
Algorithm 1: FedSC algorithm |
Input: local dataset of each client. Output: the global model. Each client sends its own data attributes I to the central server. Server execution: cluster the clients according to the data attribute I of each client; initialize the global model parameter w0. (The remaining steps of the algorithm are given as a figure in the original article.)
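Since the full pseudocode of Algorithm 1 is reproduced only as a figure in the original article, the following Python sketch outlines one communication round of the cluster-sequential training described above; names such as local_train and the weighted_average helper are illustrative placeholders rather than the authors' code.

```python
import copy
import random

def weighted_average(states, weights):
    """Weighted average of client model state_dicts (FedAvg-style aggregation)."""
    total = sum(weights)
    avg = copy.deepcopy(states[0])
    for key in avg:
        avg[key] = sum(w / total * s[key] for s, w in zip(states, weights))
    return avg

def fedsc_round(global_state, clusters, clients, frac_c, epochs):
    """One communication round of FedSC: visit the N clusters in sequence,
    aggregating within each cluster before passing the model to the next."""
    for cluster in clusters:                                  # N intra-cluster aggregations
        selected = random.sample(cluster, max(1, int(frac_c * len(cluster))))
        states, sizes = [], []
        for cid in selected:                                  # local training on private data
            state, n_samples = clients[cid].local_train(global_state, epochs)
            states.append(state)
            sizes.append(n_samples)
        global_state = weighted_average(states, sizes)        # w^(f) after this cluster
    return global_state                                       # w^(f)_N after all N clusters
```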
To investigate the effectiveness of the FedSC algorithm in non-IID data settings, we conduct extensive experiments on three public datasets, including two image datasets (i.e., MNIST [40], FEMNIST [41]) and one tabular dataset (we collect 6000 samples of each of 10 kinds of attack flows from the network security dataset CICIDS2017 [42] as the training set, and 1000 samples of each of the corresponding attack flows as the test set). The statistics of the datasets are summarized in Table 1. We use an MLP with three hidden layers as the base model, trained with the SGD optimizer with a learning rate of 0.01 and a batch size of 64. Furthermore, we reproduce the FedAvg, FedProx, Scaffold, and FedNova algorithms and run all algorithms for the same number of rounds for a fair comparison. By default, the number of rounds is set to 100, the number of clients to 100 (P), the number of local epochs to 1 (E), the fraction of clients selected to 1 (C), the β of the Dirichlet distribution to 0.5 (D), the added noise to 0 (Z) and the number of clusters to 10 (G), unless stated otherwise. Because the selection of clients in federated learning involves a certain degree of randomness, the average accuracy is used as the evaluation standard instead of the highest accuracy. The models and simulations in this paper are implemented in Python 3.7, mainly using the pytorch-gpu 1.10.1 framework.
Datasets | Training instances | Test instances | Features | Classes
MNIST | 60,000 | 10,000 | 784 | 10 |
FMNIST | 60,000 | 10,000 | 784 | 10 |
CICIDS2017 | 60,000 | 10,000 | 256 | 10 |
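For reference, a minimal PyTorch sketch of the base model and optimizer configuration described above; the hidden-layer widths are not specified in the paper, so the ones used here are illustrative assumptions.

```python
import torch
import torch.nn as nn

class MLP(nn.Module):
    """Base model: an MLP with three hidden layers, used for all datasets."""
    def __init__(self, in_features=784, num_classes=10, hidden=(200, 100, 64)):
        super().__init__()
        layers, prev = [], in_features
        for h in hidden:
            layers += [nn.Linear(prev, h), nn.ReLU()]
            prev = h
        layers.append(nn.Linear(prev, num_classes))
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x.flatten(start_dim=1))

model = MLP()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)  # SGD, learning rate 0.01
criterion = nn.CrossEntropyLoss()
batch_size = 64
```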
The order of data samples has an important impact on machine learning. Here, we explore the influence of the order of data samples on the FedSC algorithm. When data samples are arranged according to a certain rule, machine learning treats this rule as a feature, which leads to overfitting. We first run the experiment with the clusters of clients visited in a fixed order, then shuffle the order of the clusters and run the experiment again for comparison. The experimental results are shown in Figure 7.
It can be seen from Figure 7 that the order of data samples has little influence on FedSC algorithm. In the FedSC algorithm, clients are randomly selected in each round of communication and aggregated in the cluster. These operations effectively mitigate overfitting, minimize the influence of data sample order on the model, and enhance the model's generalization ability.
In the FedSC algorithm, it is important to determine the optimal number of client clusters. In this experiment, the data is divided into 100 clients, and the number of clusters G is varied as 1, 10, 50, and 100. The experimental results are shown in Figure 8.
As can be seen from Figure 8, it is reasonable that the more clusters there are, the better the final result becomes. In the FedSC algorithm, the model aggregated within each cluster is treated as a model trained on one batch of data in traditional machine learning. When the number of client clusters is small, each cluster contains a larger number of clients, resulting in scattered data distributions among clients within the cluster. This makes it challenging to aggregate a high-quality model, leading to poor performance of the final global model. When the number of clusters is large, a better model can be aggregated, but the large amount of serial training takes a long time. When G = 1, the FedSC algorithm reduces to the FedAvg algorithm, which is affected by the non-IID data and cannot aggregate a good global model. When G = 100, each client is equivalent to one batch in traditional machine learning, and better results can be obtained at the cost of more time. To balance model quality against time cost, the following experiments set the number of client clusters to 10, that is, the number of data classes.
The number of clients participating in federated learning has been increasing due to the advantages it offers in addressing data uncontrollability and data leakage issues associated with traditional centralized training. Thus, we verify the effect of the number of clients on the algorithm; that is, the effect of the hyperparameter P in the algorithm on the experiment, which controls the number of clients in federated learning. The results obtained are shown in Table 2 and Figure 9.
Datasets | Setting | FedSC | FedAvg | FedNova | Scaffold | FedProx |
p=10 | 91.44% | 86.57% | 87.90% | 90.26% | 86.51% | |
MNIST | p=100 | 88.96% | 78.13% | 77.61% | 76.20% | 77.16% |
p=1000 | 73.40% | 44.21% | 44.14% | 47.13% | 41.98% | |
p=10 | 80.33% | 77.60% | 78.31% | 79.45% | 78.00% | |
FMNIST | p=100 | 65.56% | 68.47% | 68.34% | 66.69% | 67.82% |
p=1000 | 66.65% | 53.05% | 49.99% | 47.67% | 53.16% | |
p=10 | 81.20% | 69.27% | 69.59% | 71.94% | 69.38% | |
CICIDS2017 | p=100 | 58.59% | 49.15% | 51.16% | 50.28% | 49.18% |
p=1000 | 48.77% | 25.48% | 26.47% | 25.39% | 28.59% |
It can be seen from Table 2 and Figure 9 that the number of clients has a great impact on federated learning, but the FedSC algorithm is better than other algorithms. The accuracy of the FedAvg, FedNova, Scaffold and FedProx algorithms decreases significantly as the number of clients increases. For instance, when the number of clients increases from 10 to 1000, the accuracy decreases by almost half. When the total data volume is certain, the more clients there are, the less data volume each client has. Therefore, the local model trained by each client may not be accurate enough, resulting in a degradation of the performance of the global model. The accuracy of FedSC algorithm decreases as the number of clients increases, but it is still within an acceptable range. Too many clients also means more communication, which leads to increased communication overhead. Too many clients can also lead to an increase in federated learning training time, as all clients need to be trained before model aggregation can take place. Therefore, reasonable control of the number of clients is very important for the performance and efficiency of federated learning. It is important to ensure that the model can be adequately trained from diverse data while also ensuring training speed and communication efficiency.
Now, we verify the effect of different data distributions on federated learning, i.e., the effect of the hyperparameter D on the experiments. When D becomes small, the data distribution is biased towards non-IID. When D becomes large, the data distribution is biased towards IID. The results are shown in Table 3 and Figure 10.
Datasets | Setting | FedSC | FedAvg | FedNova | Scaffold | FedProx |
D−>0 | 81.34% | 74.17% | 75.70% | 78.61% | 72.65% | |
MNIST | D=0.1 | 84.12% | 77.40% | 77.70% | 76.13% | 76.80% |
D−>∞ | 90.26% | 76.08% | 77.56% | 78.54% | 77.42% | |
D−>0 | 64.18% | 62.46% | 62.90% | 64.81% | 60.58% | |
FMNIST | D=0.1 | 68.23% | 66.60% | 68.28% | 67.15% | 65.88% |
D−>∞ | 80.38% | 66.55% | 66.43% | 66.83% | 66.85% | |
D−>0 | 47.99% | 31.70% | 46.35% | 47.46% | 31.77% | |
CICIDS2017 | D=0.1 | 51.70% | 43.84% | 48.70% | 49.14% | 42.39% |
D−>∞ | 80.52% | 56.53% | 56.55% | 56.64% | 56.05% |
It can be seen from Table 3 and Figure 10 that the FedSC algorithm shows better results regardless of whether the data distribution is biased towards IID or non-IID. Additionally, all algorithms exhibit better performance when the data distribution is biased towards IID compared to when it is biased towards non-IID. In [36], it is pointed out that forcing the average fusion of model parameters with large differences will lead to a decrease in model accuracy. When the data distribution is biased towards non-IID, the datasets owned by the clients are obviously different, resulting in different models trained by each client. In this case, using the FedAvg algorithm, the model parameters provided by the clients vary greatly and the model after server aggregation has significant deviations, leading to model performance degradation. Although the FedProx, FedNova and Scaffold algorithms improve on the FedAvg algorithm, the results are still less than satisfactory. In the FedSC algorithm, clients in each cluster have roughly the same data distribution, so the local models of clients in the same cluster are roughly the same. The central server aggregates these local models from each cluster, resulting in a global model that incorporates more comprehensive characteristics of such data. As a result, the FedSC algorithm achieves better and more stable performance.
Here, we verify the influence of the number of clients selected in each round of communication on the algorithm; that is, the influence of the hyperparameter C, which controls the degree of client parallelism in federated learning. In each round, a C-fraction of all clients is randomly selected. The results obtained are shown in Table 4 and Figure 11.
Datasets | Setting | FedSC | FedAvg | FedNova | Scaffold | FedProx |
C=0.2 | 83.39% | 76.06% | 75.30% | 76.70% | 76.70% | |
MNIST | C=0.6 | 86.17% | 76.01% | 76.92% | 77.15% | 76.57% |
C=1.0 | 86.68% | 77.72% | 77.48% | 76.71% | 77.85% | |
C=0.2 | 73.29% | 65.03% | 64.63% | 66.20% | 65.65% | |
FMNIST | C=0.6 | 71.81% | 67.27% | 67.31% | 67.16% | 67.31% |
C=1.0 | 71.84% | 67.55% | 65.91% | 66.19% | 67.06% | |
C=0.2 | 65.18% | 45.16% | 48.24% | 51.25% | 43.99% | |
CICIDS2017 | C=0.6 | 66.93% | 48.73% | 51.49% | 51.99% | 46.38% |
C=1.0 | 68.14% | 46.96% | 51.20% | 50.32% | 49.76% |
It can be seen from Table 4 and Figure 11 that the FedSC algorithm consistently achieves a higher accuracy rate compared to other algorithms in all scenarios. This superiority can be attributed to the fact that the FedSC algorithm ensures that the data within each cluster is almost IID, resulting in similar parameters among local models. Consequently, a global model with improved performance can be effectively aggregated. However, the training process of the FedSC algorithm is very unstable and the curve fluctuates a lot when fewer clients are selected each time, which is a drawback of the FedSC. The Scaffold algorithm is very stable in the training process with fewer selected clients, but its communication per round increases due to the additional variables introduced by the Scaffold algorithm.
It can also be seen that regardless of the FedSC algorithm or other algorithms, the more clients selected each time, the smaller the fluctuation of the training result, the more stable the training process, and the smoother the training curve. This is due to the fact that selecting more clients increases the number of data samples seen in each training round, resulting in more accurate results. Therefore, there will be no large fluctuations after aggregation.
Increasing the number of clients selected in each round has a positive impact on all algorithms, as it improves stability and accuracy. Selecting more clients per round of training is an effective approach to enhance performance. However, because each client needs to upload or download model parameters from the central server for each round of training, selecting more clients means more traffic. The FedSC algorithm can achieve relatively good results when C is small, which means that few clients are needed to be selected in federated learning to achieve the desired accuracy and fewer clients means a reduction in total traffic. This is helpful to solve the problem of communication bottlenecks in federated learning.
Here, we verify the effect of the number of rounds updated locally by each client on federated learning, i.e., the effect of the hyperparameter E on the experiments. The results are shown in Table 5 and Figure 12.
Datasets | Setting | FedSC | FedAvg | FedNova | Scaffold | FedProx |
E=1 | 86.07% | 77.12% | 76.47% | 75.63% | 78.05% | |
MNIST | E=10 | 92.31% | 87.50% | 87.66% | 89.99% | 87.34% |
E=20 | 93.46% | 89.01% | 89.39% | 91.44% | 88.93% | |
E=1 | 77.15% | 67.63% | 66.45% | 67.37% | 67.08% | |
FMNIST | E=10 | 80.85% | 78.79% | 79.09% | 80.28% | 78.38% |
E=20 | 82.58% | 80.08% | 80.56% | 81.83% | 79.91% | |
E=1 | 67.77% | 50.71% | 50.80% | 50.92% | 50.52% | |
CICIDS2017 | E=10 | 78.33% | 68.88% | 69.29% | 70.80% | 68.92% |
E=20 | 80.05% | 73.78% | 74.06% | 74.86% | 73.36% |
As can be seen from Table 5 and Figure 12, as the number of local training rounds changes, the FedSC algorithm maintains a strong lead. The average accuracy of the FedSC algorithm remains at a high level and the training process is relatively stable, which is sufficient to demonstrate the superiority of the proposed FedSC algorithm.
It can be seen that increasing the number of local training rounds brings a certain improvement for all algorithms. When there are few local training rounds, the clients may not be trained sufficiently and the uploaded model parameters may not be accurate enough, which affects the performance of the global model. Increasing the number of local training rounds enables the local model to learn the features of the client's local data better, and the global model naturally becomes better and more stable. Therefore, increasing the number of local training rounds is a good way to improve the accuracy and stability of the algorithm. However, a larger number of local training epochs increases the local computation of the clients, which increases the training time and communication burden. Therefore, the selection of the number of local training rounds is crucial to the performance of federated learning, and it needs to take into account factors such as the dataset, model complexity, and computing and communication resources.
Federated learning techniques are vulnerable to Byzantine failures [43,44,45], biased local datasets, and poisoning attacks. Here, we verify the performance of the federated learning algorithms when the clients are disturbed by noise. Gaussian noise is added to the datasets to make the client training results inaccurate; Gaussian noise is a popular perturbation [46], especially for images. We add Gaussian noise to each pixel of the image data and to each entry of the tabular data. The noise is expressed as $\hat{x} \sim \mathrm{Gau}(\mu, \sigma)$; we set the mean of the Gaussian noise to $\mu = 0$ and vary its variance $\sigma$. It can be seen from Figure 13 that the larger the variance of the added Gaussian noise, the blurrier the image.
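A sketch of this noise injection (an illustrative helper; the assumption that pixel values lie in [0, 1] and the optional clipping step are ours, not stated in the text, and the paper's σ denotes the variance of the noise):

```python
import numpy as np

def add_gaussian_noise(x, sigma, clip_range=None, seed=None):
    """Add zero-mean Gaussian noise with scale sigma to the data, applied
    element-wise to image pixels or tabular feature values."""
    rng = np.random.default_rng(seed)
    noisy = x + rng.normal(loc=0.0, scale=sigma, size=np.shape(x))
    if clip_range is not None:                 # optionally keep values in a valid range
        noisy = np.clip(noisy, *clip_range)
    return noisy

# Z = 1: noise of scale 1 on MNIST-style images assumed to be scaled to [0, 1].
images = np.random.rand(8, 28, 28)
noisy_images = add_gaussian_noise(images, sigma=1.0, clip_range=(0.0, 1.0))
```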
The variance of the Gaussian noise is set to 0, 1 and 5 for the MNIST and FMNIST datasets, and to 0, 1 and 2 for the CICIDS2017 dataset. The experimental results are shown in Table 6 and Figure 14.
Datasets | Setting | FedSC | FedAvg | FedNova | Scaffold | FedProx |
Z=0 | 87.26% | 78.61% | 77.80% | 77.70% | 78.28% | |
MNIST | Z=1 | 81.82% | 73.31% | 71.17% | 72.39% | 74.82% |
Z=5 | 64.74% | 56.56% | 55.56% | 53.39% | 49.91% | |
Z=0 | 74.08% | 68.15% | 67.85% | 67.12% | 67.53% | |
FMNIST | Z=1 | 72.50% | 61.67% | 61.74% | 61.03% | 62.38% |
Z=5 | 44.10% | 37.30% | 31.67% | 33.11% | 29.58% | |
Z=0 | 68.83% | 52.79% | 49.97% | 46.94% | 53.48% | |
CICIDS2017 | Z=1 | 53.94% | 44.24% | 45.35% | 43.71% | 48.95% |
Z=2 | 45.35% | 38.93% | 37.67% | 35.02% | 39.90% |
The addition of noise disturbs the information in the dataset, which has a negative impact on the performance of federated learning. With the addition of Gaussian noise, the data in the dataset become more complex and difficult to process, making it harder for the clients to capture the features of the data during local training. Therefore, adding noise makes the training results of some clients unreliable or inaccurate, thus affecting the updates of the global model in the federated learning algorithms. As can be seen from Figure 14, the performance of all federated learning algorithms declines as the level of noise increases. Although the FedSC algorithm is also affected by noise, it still gives better results than the other algorithms. The addition of Gaussian noise can increase the privacy of the data, but it also affects the performance of the model.
To compare the efficiency of the different FL algorithms, we set P = 100, D = 0.5, C = 1, E = 1, Z = 0 for this experiment. Table 7 shows the total computation time over 100 communication rounds, and Table 8 shows the communication cost per client in each communication round.
Mnist | Fmnist | CICIDS2017 | |
FedSC | 202 s | 202 s | 206 s |
FedAvg | 205 s | 201 s | 210 s |
FedNova | 195 s | 193 s | 206 s |
Scaffold | 264 s | 268 s | 272 s |
FedProx | 378 s | 368 s | 412 s |
Mnist | Fmnist | CICIDS2017 | |
FedSC | 1.21 M | 1.21 M | 0.41 M |
FedAvg | 1.21 M | 1.21 M | 0.41 M |
FedNova | 1.21 M | 1.21 M | 0.41 M |
Scaffold | 2.42 M | 2.42 M | 0.82 M |
FedProx | 1.21 M | 1.21 M | 0.41 M |
We can observe that the time consumed by the FedSC, FedAvg and FedNova algorithms is very close. However, the Scaffold algorithm introduces additional operations during training, resulting in slightly increased time consumption. FedProx directly modifies the objective, which causes additional computation overhead in the gradient descent of each batch and therefore takes the longest time. For the communication cost, the FedSC, FedAvg, FedNova and FedProx algorithms upload only their local model parameters per round, so the communication cost is the same. However, the Scaffold algorithm transmits additional control variates in each round, so its communication cost is twice that of the other algorithms.
Federated learning plays a pivotal role in the field of machine learning, particularly as the importance of data privacy continues to grow. Unfortunately, the accuracy of federated learning is significantly influenced by the distribution of data. When the data exhibits non-IID characteristics, the performance of federated learning deteriorates. This paper introduces a novel federated learning algorithm called FedSC, which outperforms other algorithms in non-IID data scenarios. The FedSC algorithm aims to identify local IID data within non-IID data to mitigate the adverse impact of non-IID data on federated learning. Through experimental comparisons, the FedSC algorithm consistently achieves higher accuracy compared to alternative federated learning algorithms. Moreover, the FedSC algorithm ensures data security by completing the model training without sharing data among participating parties. This contribution further advances the development of federated learning.
The authors declare they have not used Artificial Intelligence (AI) tools in the creation of this article.
The authors declare there is no conflict of interest.
[1] C. Tomlin, G. J. Pappas, S. Sastry, Conflict resolution for air traffic management: a study in multiagent hybrid systems, IEEE Trans. Autom. Control, 43 (1998), 509–521. https://doi.org/10.1109/9.664154
[2] Y. Zou, H. J. Zhang, W. He, Adaptive coordinated formation control of heterogeneous vertical takeoff and landing UAVs subject to parametric uncertainties, IEEE Trans. Cybern., 52 (2022), 3184–3195. https://doi.org/10.1109/TCYB.2020.3009404
[3] L. Cao, Y. Q. Chen, Z. D. Zhang, H. N. Li, A. K. Misra, Predictive smooth variable structure filter for attitude synchronization estimation during satellite formation flying, IEEE Trans. Aerosp. Electron. Syst., 53 (2017), 1375–1383. https://doi.org/10.1109/TAES.2017.2671118
[4] R. Olfati-Saber, R. Murray, Consensus problems in networks of agents with switching topology and time-delays, IEEE Trans. Autom. Control, 49 (2004), 1520–1533. https://doi.org/10.1109/TAC.2004.834113
[5] G. H. Wen, W. X. Zheng, On constructing multiple Lyapunov functions for tracking control of multiple agents with switching topologies, IEEE Trans. Autom. Control, 64 (2019), 3796–3803. https://doi.org/10.1109/TAC.2018.2885079
[6] H. Q. Li, X. F. Liao, T. W. Huang, W. Zhu, Event-triggering sampling based leader-following consensus in second-order multi-agent systems, IEEE Trans. Autom. Control, 60 (2015), 1998–2003. https://doi.org/10.1109/TAC.2014.2365073
[7] S. Wasserman, K. Faust, Social Network Analysis: Methods and Applications, Cambridge University Press, 1994.
[8] C. Altafini, G. Lini, Predictable dynamics of opinion forming for networks with antagonistic interactions, IEEE Trans. Autom. Control, 60 (2015), 342–357. https://doi.org/10.1109/TAC.2014.2343371
[9] C. Altafini, Consensus problems on networks with antagonistic interactions, IEEE Trans. Autom. Control, 58 (2013), 935–946. https://doi.org/10.1109/TAC.2012.2224251
[10] A. H. Hu, Y. Y. Wang, J. D. Cao, A. Alsaedi, Event-triggered bipartite consensus of multi-agent systems with switching partial couplings and topologies, Inf. Sci., 521 (2020), 1–13. https://doi.org/10.1016/j.ins.2020.02.038
[11] P. Gong, Exponential bipartite consensus of fractional-order non-linear multi-agent systems in switching directed signed networks, IET Control Theory Appl., 14 (2020), 2582–2591. https://doi.org/10.1049/iet-cta.2019.1241
[12] M. Shahvali, A. Azarbahram, M. Naghibi-Sistani, J. Askari, Bipartite consensus control for fractional-order nonlinear multi-agent systems: An output constraint approach, Neurocomputing, 397 (2020), 212–223. https://doi.org/10.1016/j.neucom.2020.02.036
[13] Y. Cheng, Y. H. Wu, B. Z. Guo, Absolute boundary stabilization for an axially moving Kirchhoff beam, Automatica, 129 (2021), 109667. https://doi.org/10.1016/j.automatica.2021.109667
[14] S. T. Le, Y. H. Wu, Y. Q. Guo, C. D. Vecchio, Game theoretic approach for a service function chain routing in NFV with coupled constraints, IEEE Trans. Circuits Syst. II Express Briefs, 68 (2021), 3557–3561. https://doi.org/10.1109/TCSII.2021.3070025
[15] Y. Cheng, Y. H. Wu, B. Z. Guo, Boundary stability criterion for a nonlinear axially moving beam, IEEE Trans. Autom. Control, 67 (2022), 5714–5729. https://doi.org/10.1109/TAC.2021.3124754
[16] X. N. Song, Q. Y. Zhang, M. Wang, S. Song, Distributed estimation for nonlinear PDE systems using space-sampling approach: applications to high-speed aerospace vehicle, Nonlinear Dyn., 106 (2021), 3183–3198. https://doi.org/10.1007/s11071-021-06725-4
[17] A. Pilloni, A. Pisano, Y. Orlov, E. Usai, Consensus-based control for a network of diffusion PDEs with boundary local interaction, IEEE Trans. Autom. Control, 61 (2016), 2708–2713. https://doi.org/10.1109/TAC.2015.2506990
[18] P. He, Consensus of uncertain parabolic PDE agents via adaptive unit-vector control scheme, IET Control Theory Appl., 12 (2018), 2488–2494. https://doi.org/10.1049/iet-cta.2018.5202
[19] Y. N. Chen, Z. Q. Zuo, Y. J. Wang, Bipartite consensus for a network of wave PDEs over a signed directed graph, Automatica, 129 (2021), 109640. https://doi.org/10.1016/j.automatica.2021.109640
[20] L. R. Zhao, H. Q. Wu, J. D. Cao, Finite/fixed-time bipartite consensus for networks of diffusion PDEs via event-triggered control, Inf. Sci., 609 (2022), 1435–1450. https://doi.org/10.1016/j.ins.2022.07.151
[21] X. H. Wang, H. Q. Wu, J. D. Cao, Global leader-following consensus in finite time for fractional-order multi-agent systems with discontinuous inherent dynamics subject to nonlinear growth, Nonlinear Anal. Hybrid Syst., 37 (2020), 100888. https://doi.org/10.1016/j.nahs.2020.100888
[22] Y. Q. Zhang, H. Q. Wu, J. D. Cao, Global Mittag-Leffler consensus for fractional singularly perturbed multi-agent systems with discontinuous inherent dynamics via event-triggered control strategy, J. Franklin Inst., 358 (2021), 2086–2114. https://doi.org/10.1016/j.jfranklin.2020.12.033
[23] Y. Q. Zhang, H. Q. Wu, J. D. Cao, Group consensus in finite time for fractional multiagent systems with discontinuous inherent dynamics subject to Hölder growth, IEEE Trans. Cybern., 52 (2022), 4161–4172. https://doi.org/10.1109/TCYB.2020.3023704
[24] X. N. Li, H. Q. Wu, J. D. Cao, Prescribed-time synchronization in networks of piecewise smooth systems via a nonlinear dynamic event-triggered control strategy, Math. Comput. Simul., 203 (2023), 647–668. https://doi.org/10.1016/j.matcom.2022.07.010
[25] H. G. Zhang, Z. Y. Gao, Y. C. Wang, Y. L. Cai, Leader-following exponential consensus of fractional-order descriptor multiagent systems with distributed event-triggered strategy, IEEE Trans. Syst. Man Cybern.: Syst., 52 (2022), 3967–3979. https://doi.org/10.1109/TSMC.2021.3082549
[26] P. Gong, Exponential bipartite consensus of fractional-order non-linear multi-agent systems in switching directed signed networks, IET Control Theory Appl., 14 (2020), 2582–2591. https://doi.org/10.1049/iet-cta.2019.1241
[27] B. Mbodje, G. Montseny, Boundary fractional derivative control of the wave equation, IEEE Trans. Autom. Control, 40 (1995), 378–382. https://doi.org/10.1109/9.341815
[28] F. D. Ge, Y. Q. Chen, Event-driven boundary control for time fractional diffusion systems under time-varying input disturbance, in 2018 Annual American Control Conference (ACC), (2018), 140–145. https://doi.org/10.23919/ACC.2018.8431000
[29] F. D. Ge, Y. Q. Chen, C. H. Kou, Boundary feedback stabilisation for the time fractional-order anomalous diffusion system, IET Control Theory Appl., 10 (2018), 1250–1257. https://doi.org/10.1049/iet-cta.2015.0882
[30] Y. Cao, Y. G. Kao, J. H. Park, H. B. Bao, Global Mittag-Leffler stability of the delayed fractional-coupled reaction-diffusion system on networks without strong connectedness, IEEE Trans. Neural Networks Learn. Syst., 33 (2021), 6473–6483. https://doi.org/10.1109/TNNLS.2021.3080830
[31] J. D. Cao, G. Stamov, I. Stamova, S. Simeonov, Almost periodicity in impulsive fractional-order reaction-diffusion neural networks with time-varying delays, IEEE Trans. Cybern., 51 (2021), 151–161. https://doi.org/10.1109/TCYB.2020.2967625
[32] J. H. Qin, G. S. Zhang, W. X. Zheng, Y. Kang, Adaptive sliding mode consensus tracking for second-order nonlinear multiagent systems with actuator faults, IEEE Trans. Cybern., 49 (2019), 1605–1615. https://doi.org/10.1109/TCYB.2018.2805167
[33] J. Sun, C. Guo, L. Liu, Q. H. Shan, Adaptive consensus control of second-order nonlinear multi-agent systems with event-dependent intermittent communications, J. Franklin Inst., 360 (2023), 2289–2306. https://doi.org/10.1016/j.jfranklin.2022.10.045
[34] J. Wang, M. Krstic, Output-feedback boundary control of a heat PDE sandwiched between two ODEs, IEEE Trans. Autom. Control, 64 (2019), 4653–4660. https://doi.org/10.1109/TAC.2019.2901704
[35] J. Sun, Z. S. Wang, Event-triggered consensus control of high-order multi-agent systems with arbitrary switching topologies via model partitioning approach, Neurocomputing, 413 (2020), 14–22. https://doi.org/10.1016/j.neucom.2020.06.058
[36] X. Z. Liu, K. N. Wu, Z. T. Li, Exponential stabilization of reaction-diffusion systems via intermittent boundary control, IEEE Trans. Autom. Control, 67 (2022), 3036–3042. https://doi.org/10.1109/TAC.2021.3100289
[37] X. Z. Liu, K. N. Wu, W. H. Zhang, Intermittent boundary stabilization of stochastic reaction-diffusion Cohen-Grossberg neural networks, Neural Networks, 131 (2020), 1–13. https://doi.org/10.1016/j.neunet.2020.07.019
[38] X. Y. Li, Q. L. Fan, X. Z. Liu, K. N. Wu, Boundary intermittent stabilization for delay reaction-diffusion cellular neural networks, Neural Comput. Appl., 34 (2022), 18561–18577. https://doi.org/10.1007/s00521-022-07457-1
[39] N. Espitia, A. Polyakov, D. Efimov, W. Perruquetti, Boundary time-varying feedbacks for fixed-time stabilization of constant-parameter reaction-diffusion systems, Automatica, 103 (2019), 398–407. https://doi.org/10.1016/j.automatica.2019.02.013
[40] T. Hashimoto, M. Krstic, Stabilization of reaction-diffusion equations with state delay using boundary control input, IEEE Trans. Autom. Control, 61 (2016), 4041–4047. https://doi.org/10.1109/TAC.2016.2539001
[41] C. Prieur, E. Trélat, Feedback stabilization of a 1-D linear reaction-diffusion equation with delay boundary control, IEEE Trans. Autom. Control, 64 (2019), 1415–1425. https://doi.org/10.1109/TAC.2018.2849560
[42] J. Sun, Z. S. Wang, Consensus of multi-agent systems with intermittent communications via sampling time unit approach, Neurocomputing, 397 (2020), 149–159. https://doi.org/10.1016/j.neucom.2020.02.055
[43] Z. S. Wang, J. Sun, H. G. Zhang, Stability analysis of T-S fuzzy control system with sampled-dropouts based on time-varying Lyapunov function method, IEEE Trans. Syst. Man Cybern.: Syst., 50 (2020), 2566–2577. https://doi.org/10.1109/TSMC.2018.2822482
[44] Z. B. Wang, H. Q. Wu, Stabilization in finite time for fractional-order hyperchaotic electromechanical gyrostat systems, Mech. Syst. Signal Proc., 111 (2018), 628–642. https://doi.org/10.1016/j.ymssp.2018.04.009
[45] C. D. Yang, T. W. Huang, A. C. Zhang, J. L. Qiu, J. D. Cao, F. E. Alsaadi, Output consensus of multiagent systems based on PDEs with input constraint: A boundary control approach, IEEE Trans. Syst. Man Cybern.: Syst., 51 (2021), 370–377. https://doi.org/10.1109/TSMC.2018.2871615
[46] P. Gong, Q. L. Han, W. Y. Lan, Finite-time consensus tracking for incommensurate fractional-order nonlinear multiagent systems with directed switching topologies, IEEE Trans. Cybern., 52 (2022), 65–76. https://doi.org/10.1109/TCYB.2020.2977169
[47] S. Q. Zhang, Monotone method for initial value problem for fractional diffusion equation, Sci. China Ser. A: Math., 49 (2006), 1223–1230. https://doi.org/10.1007/s11425-006-2020-6
[48] V. Yadav, R. Padhi, S. N. Balakrishnan, Robust/optimal temperature profile control of a high-speed aerospace vehicle using neural networks, IEEE Trans. Neural Networks, 18 (2007), 1115–1128. https://doi.org/10.1109/TNN.2007.899229