
Nowadays, smart terminals, such as smart bracelets, are becoming increasingly popular. These devices are easy to carry and have many powerful sensors that can detect various inputs from the wearer's body [1,2]. These data have strong research value. Using these data to train machine learning models will hopefully make these smart terminals smarter to better serve people. As people pay increasing attention to the protection of data privacy [3,4], it is impossible and inadvisable to collect these private data, which leads to the problem of data silos [5]. However, if only local data is used for training, then the following problems emerge: 1) insufficient local data leads to model convergence difficulties and poor performance, and 2) limited diversity of data types hampers the model's generalization capability. As a solution, federated learning is proposed [6,7]. In federated learning, each participant uses their own private data to train the local model that is sent to the central server for aggregation to obtain the global model. Under the framework of federated learning, participants' private data are not exported locally, which protects participants' data privacy [8,9,10].
Although federated learning offers significant advantages in terms of data privacy protection compared to centralized training, it also faces numerous challenges [11,12]. Among these challenges, heterogeneity stands out as the most crucial one [13,14]. System heterogeneity arises when participants have varying storage capacities and computing power [15,16]. Statistical heterogeneity, on the other hand, emerges when participants exhibit different data type distributions and data volumes, leading to the non-IID (non-Independent and Identically Distributed) problem [17]. While hardware advancements have gradually addressed system heterogeneity, statistical heterogeneity remains a persistent challenge. Non-IID data are prevalent in real-life scenarios, and federated learning's performance declines notably when confronted with such data. The classical federated learning algorithm proposed by Google [18,19,20,21], FedAvg (Federated Averaging), fails to achieve superior performance across all clients compared to models trained locally by individual clients. Some participants even experience minimal benefits or inferior performance when participating in federated learning, discouraging their willingness to engage in the process. The existing FedAvg algorithm no longer meets practical requirements, necessitating the development of a novel algorithm to address the statistical heterogeneity challenge posed by non-IID data distributions. This paper aims to tackle this challenge and provide a solution.
The negative effects of non-IID data on FedAvg can be explained by examining how the model parameters diverge.
We define $l(x_i, y_i; w)$ as the loss of the prediction on example $(x_i, y_i)$ made with model parameters $w$. We aim to solve:

$$\min_{w \in \mathbb{R}^d} f(w) \quad \text{where} \quad f(w) \overset{\text{def}}{=} \frac{1}{n}\sum_{i=1}^{n} l(x_i, y_i; w) \qquad (1.1)$$
In the centralized machine learning environment, let $w^{(c)}_t$ denote the weights after the $t$-th update. The model parameters are then updated in the following way:

$$w^{(c)}_t = w^{(c)}_{t-1} - \eta \nabla_w f\big(w^{(c)}_{t-1}\big) \qquad (1.2)$$
In the federated learning environment, we assume there are $K$ clients over which the data are partitioned, with $P_k$ the set of indexes of data points on client $k$ and $n_k = |P_k|$. On each client, local training is conducted separately using local data:

$$w^{(k)}_t = w^{(k)}_{t-1} - \eta \nabla_w f\big(w^{(k)}_{t-1}\big) \qquad (1.3)$$
Let $w^{(f)}_t$ denote the aggregated weights computed by the central server:

$$w^{(f)}_t = \sum_{k=1}^{K} \frac{n_k}{\sum_{k'=1}^{K} n_{k'}} \, w^{(k)}_t \qquad (1.4)$$
Finally, we define:

$$\text{weight divergence} = \big\| w^{(f)}_t - w^{(c)}_t \big\| \qquad (1.5)$$
The divergence between $w^{(k)}_t$, $w^{(f)}_t$, and $w^{(c)}_t$ can be visualized in Figures 1 and 2. When the data are IID, the data distribution of each client $k$ is almost identical to the global data distribution, so the divergence between $w^{(k)}_t$ and $w^{(c)}_t$ is small. Therefore, $w^{(f)}_t$, obtained after aggregating the different $w^{(k)}_t$ according to Eq (1.4), also diverges very little from $w^{(c)}_t$. Even after many iterations, $w^{(f)}_t$ remains close to $w^{(c)}_t$ and the weight divergence stays small; in this case, federated learning performs very well. When the data are non-IID, the large differences in the distribution of data owned by each client cause the divergence between the $w^{(k)}_t$ of different clients to grow, and the divergence between each $w^{(k)}_t$ and $w^{(c)}_t$ also grows. Therefore, the divergence between $w^{(f)}_t$ and $w^{(c)}_t$ becomes much larger and accumulates quickly; after many iterations, the weight divergence keeps increasing. From the above analysis, we conclude that the negative impact of non-IID data on federated learning is mainly due to the differences in the data distributions of the clients. Based on this, we improve the FedAvg algorithm and propose a new algorithm that clusters clients with similar data distributions, finding locally IID data within the overall non-IID distribution to solve the non-IID problem.
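To make Eqs (1.2)–(1.5) concrete, the following is a minimal sketch of how the weight divergence could be computed for two PyTorch models with identical architectures; the function name is ours and is not part of the paper's code.

```python
import torch

def weight_divergence(federated_model, centralized_model):
    """Compute ||w_f - w_c|| (Eq 1.5) over all parameters of two identically shaped models."""
    w_f = torch.cat([p.detach().flatten() for p in federated_model.parameters()])
    w_c = torch.cat([p.detach().flatten() for p in centralized_model.parameters()])
    return torch.norm(w_f - w_c).item()
```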
Contributions of this paper:
1) In this paper, the FedSC algorithm is proposed to improve the accuracy of federated learning in the case of data imbalance, helping to address the problem of data heterogeneity in federated learning.
2) This paper reproduces a variety of federated learning algorithms, such as FedNova [22], SCAFFOLD [23], FedProx [17], etc. Through the comparisons, it demonstrates that FedSC consistently delivers strong performance across different scenarios.
3) Data training in the FedSC algorithm is performed only locally and no local data are transmitted, which ensures data privacy and security for all participants and helps to solve the problem of "data silos".
The structure of this paper is as follows. Section 2 presents the related work. Section 3 describes the FedSC algorithm flow in detail as well as the implementation details of each part. Section 4 presents the experimental results. Section 5 gives a summary of the full text.
Non-IID data distribution is common in real life [24]; for example, different regions may have completely different vegetation distributions [25]. Due to data regulation and privacy concerns, meaningful real federated datasets are difficult to obtain [26]. In this paper, we use the Dirichlet distribution to simulate a non-IID distribution of data, where each client is allocated a proportion of the samples of each label according to the Dirichlet distribution. The Dirichlet distribution is an appropriate choice for simulating non-IID data distributions and has been used in many studies [22,27,28]. Specifically, we sample $p_k \sim \text{Dir}_N(\beta)$ and allocate a $p_{k,j}$ proportion of the instances of class $k$ to party $j$. Here, $\text{Dir}_N(\beta)$ denotes the Dirichlet distribution and $\beta$ is a concentration parameter ($\beta > 0$). An advantage of this approach is that we can flexibly change the imbalance level by varying the concentration parameter $\beta$: the smaller $\beta$ is, the more unbalanced the partition.
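As an illustration, the following is a minimal sketch of this Dirichlet-based partitioning; the function and variable names are ours, not taken from the paper's code.

```python
import numpy as np

def dirichlet_partition(labels, num_clients, beta, seed=0):
    """Split sample indices among clients; per class, client proportions are drawn from Dir(beta)."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    client_indices = [[] for _ in range(num_clients)]
    for k in np.unique(labels):                                  # iterate over classes
        idx_k = rng.permutation(np.where(labels == k)[0])
        p_k = rng.dirichlet(np.repeat(beta, num_clients))        # proportions of class k per client
        split_points = (np.cumsum(p_k)[:-1] * len(idx_k)).astype(int)
        for client_id, part in enumerate(np.split(idx_k, split_points)):
            client_indices[client_id].extend(part.tolist())
    return client_indices

# e.g., parts = dirichlet_partition(train_labels, num_clients=100, beta=0.5)
# A smaller beta yields a more unbalanced (more non-IID) partition.
```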
The biggest challenge of federated learning is that its performance deteriorates when data are distributed as non-IID [29,30]. Many methods have been put forward to solve this problem [31,32,33,34]. The FedProx algorithm is proposed in [17], which improves the local objective function of the FedAvg algorithm. FedProx adds an adjustment term to the local objective function to limit the distance between the local model and the global model so that the local models do not become too scattered. This helps to avoid local inconsistencies and improves the generalization of the model. In addition, FedProx can balance the distance between the global and local models by adjusting the regularization hyperparameter to better accommodate different data distributions. However, its shortcomings are also obvious: the client needs to carefully tune the weight of the adjustment term in the local objective function. If the weight is too small, the adjustment has little effect; if it is too large, the local updates are very small and the local model converges slowly. The work in [22] considers that heterogeneity in the clients' local datasets and computation speeds results in large variations in the number of local updates performed by each client in each communication round, and that simple aggregation of such models causes the global model updates to be biased. To tackle this challenge, FedNova improves FedAvg in the aggregation stage by normalizing and scaling the local updates of each party according to their number of local steps before updating the global model. Therefore, the clients can iterate autonomously and make better use of the information in their local data, improving the generalization capability of the model. The FedNova algorithm achieves better performance on some non-IID datasets. Scaffold [23] introduces two variates (a server variate and a client variate), which estimate the update direction of the server model and of each client, respectively. The difference between these two variates is used to approximate the bias in local training, and the local updates are corrected by adding this difference to the local training. The Scaffold algorithm makes the federated learning model more stable, avoiding instability caused by non-IID datasets. However, due to the additional control variables, Scaffold doubles the communication size of each round compared to FedAvg. In [35], a federated learning method named FedCPF is proposed for vehicle edge computing scenarios in 6G communication networks. The method improves the efficiency and accuracy of federated learning by improving the communication scheme and optimizing the local training process, while reducing the communication overhead. Some researchers also aim to improve the non-IID data distribution by sharing a certain percentage of the data, such as FedShare [36] and Hybrid-FL [37]. Although the participants can be seen as trustworthy, direct data transfer still risks privacy leakage, which goes against the original purpose of federated learning. In [38], the authors divide clients according to the cosine similarity between the weight updates of different clients to realize federated training of multiple models; however, this generates different personalized models for different clients rather than a unified model. In [39], TiFL was proposed, which divides participants into multiple tiers according to their performance.
TiFL selects clients from the same tier in every round of training in order to alleviate the problems caused by the heterogeneity of resources and data quantity among participants. The concept of client-side clustering is also employed in this paper to address the issue of non-IID data distribution. By clustering clients based on their data distribution, the goal is to achieve a more balanced data distribution within each cluster. This approach aims to minimize the adverse effects of non-IID data on the performance of federated learning algorithms.
This section describes the FedSC algorithm in detail.
Non-IID data make local model parameters divergent in federated learning, so the central server cannot aggregate a good model; however, federated learning gives satisfactory results with IID data. Thus, we cluster the clients according to their data distribution. Clients with high data distribution similarity are placed in the same cluster, so the data distribution within each cluster is close to IID. Federated optimization within each cluster can then greatly improve both the performance of the model and the training efficiency.
Let $n_i$ be the total number of data samples owned by client $i$ and $n_{ij}$ the number of class-$j$ samples owned by client $i$, and define the data attribute of client $i$ as $I_i = [n_{i1}/n_i, n_{i2}/n_i, \ldots, n_{iJ}/n_i]$, where $J$ is the number of classes. Clients are clustered based on their data attributes.
A bottom-up clustering algorithm is proposed. Initially, each client is treated as an individual cluster, and clusters are progressively merged at each step. Ultimately, a cluster encompassing all samples is obtained. The detailed algorithm steps are as follows:
1) Treat each client as an individual cluster.
2) Calculate the distance between every pair of clusters and merge the two clusters with the smallest distance. Here, the distance between two clusters is defined as the maximum distance between any pair of clients in the two clusters. Specifically, the distance between cluster $p$ and cluster $q$ is defined as follows:
$$D_{p,q} = \max\big\{\, d_{ij} = \|I_i - I_j\|_2 \;\big|\; i \in p,\ j \in q \,\big\} \qquad (3.1)$$

where $d_{ij}$ represents the distance between the data attribute $I_i$ of client $i$ in cluster $p$ and the data attribute $I_j$ of client $j$ in cluster $q$.
3) Repeat step 2 and iteratively aggregate clusters until the number of clusters meets the requirements.
In this way, the data distribution among clients within the same cluster becomes similar. Consequently, federated learning can effectively operate within each cluster, yielding improved results.
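The merge criterion in Eq (3.1) is the maximum pairwise distance, i.e., complete linkage. Below is a minimal sketch of the data-attribute computation and this bottom-up clustering using SciPy's hierarchical clustering; the helper names are ours, not the authors'.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def data_attribute(labels, num_classes):
    """I_i = [n_i1/n_i, ..., n_iJ/n_i]: the label distribution of one client."""
    counts = np.bincount(np.asarray(labels), minlength=num_classes)
    return counts / counts.sum()

def cluster_clients(client_labels, num_classes, num_clusters):
    """Group clients with similar label distributions into `num_clusters` clusters."""
    attributes = np.stack([data_attribute(lbls, num_classes) for lbls in client_labels])
    # 'complete' linkage merges clusters by the maximum pairwise distance, matching Eq (3.1)
    merge_tree = linkage(attributes, method='complete', metric='euclidean')
    return fcluster(merge_tree, t=num_clusters, criterion='maxclust')  # one cluster id per client
```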
In centralized machine learning, the model parameters are updated using one batch of data at a time, as shown in Figure 3. All the data participate in the training process, and each type of data contributes to the model fairly.
In the FedAvg algorithm, the model parameters are updated by aggregating the model parameters of the clients. The central server sends the initial model parameters $w^{(f)}_t$ to the clients. Each client uses $w^{(f)}_t$ as the starting point, trains with its local dataset, and sends the trained model $w^{(k)}_t$ back to the central server. The central server aggregates the obtained models to produce $w^{(f)}_{t+1}$. The training process of federated learning is shown in Figure 4.
There is a wide gap between federated learning and centralized machine learning with regard to how the model parameters are updated. Because of the clustering of clients, the traditional training process of federated learning is no longer applicable, so we modify it. We make the data in each cluster contribute to the global model of federated learning by imitating the batch processing of centralized machine learning: the model aggregated by the central server from the client models within one cluster is treated as the model produced by one batch of data in centralized machine learning. The central server then transmits the aggregated model to the clients in the next cluster for training, which corresponds to continuing training the model with the next batch of data in centralized machine learning, as shown in Figure 5.
First, each client sends its data attribute $I$ to the central server. Because information about the proportions of the client's data categories may be leaked during this process, the client encrypts the data attribute $I$ before sending it. After receiving the data attributes from all clients and decrypting them, the central server uses the clustering algorithm to group all clients into $N$ clusters. Next, the central server selects a $C$-fraction of clients from the first cluster and sends the initial global model parameters $w^{(f)}_0$ to them. After the clients train for $E$ epochs with their private data, the obtained local model parameters are uploaded to the central server. The central server computes a weighted average of the obtained model parameters according to the number of data samples of the clients to obtain the global model parameters $w^{(f)}_1$. Then, the central server selects a $C$-fraction of clients from the second cluster and sends the global model parameters to them. Each selected client takes $w^{(f)}_1$ as its initial model parameters, trains for $E$ epochs on its local dataset, and sends the trained model parameters to the central server. After obtaining the model parameters of all clients selected from the second cluster, the central server aggregates them to obtain the global model parameters $w^{(f)}_2$. Subsequently, $w^{(f)}_2$ is sent to the clients selected from the third cluster, and so on. This process is repeated for $N$ aggregations, so the original model parameters are trained with the data from all $N$ clusters, yielding the model parameters $w^{(f)}_N$, which contain the knowledge learned from the data of the $N$ clusters. This constitutes one communication round. Next, the central server sends the model parameters $w^{(f)}_N$ to the clients selected from the first cluster, and the process is repeated until the model converges. In centralized machine learning, all data types are traversed in batch-size units; in the proposed algorithm, all data types are traversed in units of the data held by the clients selected from each cluster. In the FedSC algorithm, data training only occurs locally on the client side. The client and server exchange only model parameters, without sharing any data. This mitigates the risk of data leakage to a certain extent and ensures the data security of the clients. The schematic diagram of the FedSC algorithm is shown in Figure 6 and the pseudocode of the FedSC algorithm is shown in Algorithm 1.
Algorithm 1: FedSC algorithm
Input: the local dataset of each client. Output: the trained global model parameters.
Each client sends its own data attributes I to the central server. Server execution: cluster the clients according to their data attributes I; initialize the global model parameters w0; then train cluster by cluster as described above. (The remainder of the pseudocode appears only as a figure in the original article.)
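Complementing the pseudocode, below is a minimal runnable sketch of one FedSC communication round under our own assumptions (an arbitrary classification model trained with cross-entropy); the helper names and structure are ours, not the authors' implementation.

```python
import copy
import random
import torch

def weighted_average(state_dicts, sample_counts):
    """Aggregate client models by a sample-count-weighted average of their parameters (cf. Eq 1.4)."""
    total = float(sum(sample_counts))
    avg = copy.deepcopy(state_dicts[0])
    for key in avg:
        avg[key] = sum(sd[key] * (n / total) for sd, n in zip(state_dicts, sample_counts))
    return avg

def fedsc_round(model, clusters, client_loaders, C=1.0, E=1, lr=0.01):
    """One FedSC communication round: pass the global model through the N clusters in turn."""
    for cluster in clusters:                                # clusters: list of lists of client ids
        selected = random.sample(cluster, max(1, int(C * len(cluster))))
        states, counts = [], []
        for cid in selected:
            local = copy.deepcopy(model)                    # client starts from the current global model
            opt = torch.optim.SGD(local.parameters(), lr=lr)
            for _ in range(E):                              # E local epochs on the client's private data
                for x, y in client_loaders[cid]:
                    opt.zero_grad()
                    loss = torch.nn.functional.cross_entropy(local(x), y)
                    loss.backward()
                    opt.step()
            states.append(local.state_dict())
            counts.append(len(client_loaders[cid].dataset))
        # intra-cluster aggregation plays the role of one "batch" of centralized training
        model.load_state_dict(weighted_average(states, counts))
    return model
```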
To investigate the effectiveness of the FedSC algorithm in non-IID data settings, we conduct extensive experiments on three public datasets, including two image datasets (MNIST [40] and FEMNIST [41]) and one tabular dataset (we collect 6000 samples each of 10 kinds of attack flows from the network security dataset CICIDS2017 [42] as the training set, and 1000 samples each of the corresponding attack flows as the test set). The statistics of the datasets are summarized in Table 1. We use an MLP with three hidden layers as the base model, the SGD optimizer with a learning rate of 0.01, and a batch size of 64. Furthermore, we reproduce the FedAvg, FedProx, Scaffold, and FedNova algorithms and run all algorithms for the same number of rounds for a fair comparison. By default, the number of rounds is set to 100, the number of clients to 100 ($P$), the number of local epochs to 1 ($E$), the fraction of clients selected to 1 ($C$), the $\beta$ of the Dirichlet distribution to 0.5 ($D$), the added noise to 0 ($Z$), and the number of clusters to 10 ($G$), unless stated otherwise. Because the selection of clients in federated learning involves a certain degree of randomness, the average accuracy is used as the evaluation standard instead of the highest accuracy. The model building and simulation in this paper are implemented in Python 3.7, mainly using the PyTorch 1.10.1 (GPU) framework.
Datasets | Training instances | Test instance | Features | Classes |
MNIST | 60,000 | 10,000 | 784 | 10 |
FMNIST | 60,000 | 10,000 | 784 | 10 |
CICIDS2017 | 60,000 | 10,000 | 256 | 10 |
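A minimal sketch of the base model and optimizer configuration described above (a three-hidden-layer MLP, SGD with learning rate 0.01, batch size 64); the hidden-layer widths are our assumption, since they are not stated in the paper.

```python
import torch
import torch.nn as nn

class MLP(nn.Module):
    """Three-hidden-layer MLP used as the base model (hidden sizes are illustrative)."""
    def __init__(self, in_features=784, num_classes=10, hidden=(200, 100, 50)):
        super().__init__()
        layers, prev = [], in_features
        for h in hidden:
            layers += [nn.Linear(prev, h), nn.ReLU()]
            prev = h
        layers.append(nn.Linear(prev, num_classes))
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x.view(x.size(0), -1))      # flatten inputs to feature vectors

# model = MLP(in_features=784)   # 784 features for MNIST/FMNIST, 256 for CICIDS2017
# optimizer = torch.optim.SGD(model.parameters(), lr=0.01)   # batch size 64 set in the DataLoader
```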
The order of data samples has an important impact on machine learning. Here, we explore the influence of the data sample order on the FedSC algorithm. When data samples are arranged according to a certain rule, a machine learning model may learn this rule as a feature, which leads to overfitting. We first run the experiment with the clients clustered in their original order, then shuffle the order of the clusters and run the experiment again for comparison. The experimental results are shown in Figure 7.
It can be seen from Figure 7 that the order of data samples has little influence on the FedSC algorithm. In the FedSC algorithm, clients are randomly selected in each round of communication and aggregated within each cluster. These operations effectively mitigate overfitting, minimize the influence of the data sample order on the model, and enhance the model's generalization ability.
In the FedSC algorithm, it is important to determine the optimal number of client clusters. In this experiment, the data are divided among 100 clients, and the number of clusters $G$ is varied over 1, 10, 50, and 100. The experimental results are shown in Figure 8.
As can be seen from Figure 8, it is reasonable that the more clusters there are, the better the final result. In the FedSC algorithm, the model aggregated within each cluster is treated as a model trained on one batch of data in traditional machine learning. When the number of client clusters is small, each cluster contains a larger number of clients, resulting in scattered data distributions among the clients within a cluster. This makes it challenging to aggregate a high-quality model, leading to poor performance of the final global model. When the number of clusters is large, a better model can be aggregated, but the large amount of serial training takes a long time. When G = 1, the FedSC algorithm reduces to the FedAvg algorithm, which is affected by the non-IID data and cannot aggregate a good global model. When G = 100, each client is equivalent to one batch in traditional machine learning, and better results can be obtained at the cost of more time. To balance model quality against the time consumed, the following experiments set the number of client clusters to 10, i.e., the number of data classes.
The number of clients participating in federated learning has been increasing due to the advantages it offers in addressing data uncontrollability and data leakage issues associated with traditional centralized training. Thus, we verify the effect of the number of clients on the algorithm; that is, the effect of the hyperparameter P in the algorithm on the experiment, which controls the number of clients in federated learning. The results obtained are shown in Table 2 and Figure 9.
Datasets | Setting | FedSC | FedAvg | FedNova | Scaffold | FedProx |
p=10 | 91.44% | 86.57% | 87.90% | 90.26% | 86.51% | |
MNIST | p=100 | 88.96% | 78.13% | 77.61% | 76.20% | 77.16% |
p=1000 | 73.40% | 44.21% | 44.14% | 47.13% | 41.98% | |
p=10 | 80.33% | 77.60% | 78.31% | 79.45% | 78.00% | |
FMNIST | p=100 | 65.56% | 68.47% | 68.34% | 66.69% | 67.82% |
p=1000 | 66.65% | 53.05% | 49.99% | 47.67% | 53.16% | |
p=10 | 81.20% | 69.27% | 69.59% | 71.94% | 69.38% | |
CICIDS2017 | p=100 | 58.59% | 49.15% | 51.16% | 50.28% | 49.18% |
p=1000 | 48.77% | 25.48% | 26.47% | 25.39% | 28.59% |
It can be seen from Table 2 and Figure 9 that the number of clients has a great impact on federated learning, but the FedSC algorithm performs better than the other algorithms. The accuracy of the FedAvg, FedNova, Scaffold, and FedProx algorithms decreases significantly as the number of clients increases; for instance, when the number of clients increases from 10 to 1000, their accuracy drops by almost half. When the total data volume is fixed, the more clients there are, the less data each client has, so the local model trained by each client may not be accurate enough, degrading the performance of the global model. The accuracy of the FedSC algorithm also decreases as the number of clients increases, but it remains within an acceptable range. More clients also means more communication, which increases the communication overhead, and a longer training time, since all selected clients must finish training before model aggregation can take place. Therefore, reasonable control of the number of clients is very important for the performance and efficiency of federated learning: the model should be adequately trained on diverse data while training speed and communication efficiency are maintained.
Now, we verify the effect of different data distributions on federated learning, i.e., the effect of the hyperparameter D on the experiments. When D becomes small, the data distribution is biased towards non-IID. When D becomes large, the data distribution is biased towards IID. The results are shown in Table 3 and Figure 10.
Datasets | Setting | FedSC | FedAvg | FedNova | Scaffold | FedProx |
D−>0 | 81.34% | 74.17% | 75.70% | 78.61% | 72.65% | |
MNIST | D=0.1 | 84.12% | 77.40% | 77.70% | 76.13% | 76.80% |
D−>∞ | 90.26% | 76.08% | 77.56% | 78.54% | 77.42% | |
D−>0 | 64.18% | 62.46% | 62.90% | 64.81% | 60.58% | |
FMNIST | D=0.1 | 68.23% | 66.60% | 68.28% | 67.15% | 65.88% |
D−>∞ | 80.38% | 66.55% | 66.43% | 66.83% | 66.85% | |
D−>0 | 47.99% | 31.70% | 46.35% | 47.46% | 31.77% | |
CICIDS2017 | D=0.1 | 51.70% | 43.84% | 48.70% | 49.14% | 42.39% |
D−>∞ | 80.52% | 56.53% | 56.55% | 56.64% | 56.05% |
It can be seen from Table 3 and Figure 10 that the FedSC algorithm shows better results regardless of whether the data distribution is biased towards IID or non-IID. Additionally, all algorithms perform better when the data distribution is biased towards IID than when it is biased towards non-IID. In [36], it is pointed out that forcing the averaging of model parameters with large differences leads to a decrease in model accuracy. When the data distribution is biased towards non-IID, the datasets owned by the clients are clearly different, so the models trained by the clients differ as well. In this case, with the FedAvg algorithm, the model parameters provided by the clients vary greatly and the model after server aggregation has significant deviations, leading to performance degradation. Although the FedProx, FedNova, and Scaffold algorithms improve upon the FedAvg algorithm, the results are still less than satisfactory. In the FedSC algorithm, clients in each cluster have roughly the same data distribution, so the local models of the clients in each cluster are roughly the same. The central server aggregates these local models from each cluster, resulting in a global model that incorporates more comprehensive characteristics of the data. As a result, the FedSC algorithm achieves better and more stable performance.
Here, we verify the influence of the number of clients selected in each round of communication; that is, the influence of the hyperparameter $C$, which controls the fraction of clients that train in parallel in federated learning. In each round, a fraction $C$ of all clients is randomly selected. The results obtained are shown in Table 4 and Figure 11.
Datasets | Setting | FedSC | FedAvg | FedNova | Scaffold | FedProx |
C=0.2 | 83.39% | 76.06% | 75.30% | 76.70% | 76.70% | |
MNIST | C=0.6 | 86.17% | 76.01% | 76.92% | 77.15% | 76.57% |
C=1.0 | 86.68% | 77.72% | 77.48% | 76.71% | 77.85% | |
C=0.2 | 73.29% | 65.03% | 64.63% | 66.20% | 65.65% | |
FMNIST | C=0.6 | 71.81% | 67.27% | 67.31% | 67.16% | 67.31% |
C=1.0 | 71.84% | 67.55% | 65.91% | 66.19% | 67.06% | |
C=0.2 | 65.18% | 45.16% | 48.24% | 51.25% | 43.99% | |
CICIDS2017 | C=0.6 | 66.93% | 48.73% | 51.49% | 51.99% | 46.38% |
C=1.0 | 68.14% | 46.96% | 51.20% | 50.32% | 49.76% |
It can be seen from Table 4 and Figure 11 that the FedSC algorithm consistently achieves a higher accuracy than the other algorithms in all scenarios. This superiority can be attributed to the fact that the FedSC algorithm ensures that the data within each cluster are almost IID, resulting in similar parameters among the local models; consequently, a global model with improved performance can be effectively aggregated. However, the training process of the FedSC algorithm is unstable and the curve fluctuates considerably when fewer clients are selected in each round, which is a drawback of FedSC. The Scaffold algorithm is very stable in the training process with fewer selected clients, but its per-round communication increases due to the additional variables it introduces.
It can also be seen that regardless of the FedSC algorithm or other algorithms, the more clients selected each time, the smaller the fluctuation of the training result, the more stable the training process, and the smoother the training curve. This is due to the fact that selecting more clients increases the number of data samples seen in each training round, resulting in more accurate results. Therefore, there will be no large fluctuations after aggregation.
Increasing the number of clients selected in each round has a positive impact on all algorithms, as it improves stability and accuracy, so selecting more clients per round of training is an effective approach to enhance performance. However, because each client must upload or download model parameters from the central server in each round of training, selecting more clients means more traffic. The FedSC algorithm can achieve relatively good results when $C$ is small, which means that only a few clients need to be selected in federated learning to achieve the desired accuracy, and fewer clients means less total traffic. This helps to address the communication bottleneck in federated learning.
Here, we verify the effect of the number of local training epochs performed by each client on federated learning, i.e., the effect of the hyperparameter $E$ on the experiments. The results are shown in Table 5 and Figure 12.
Datasets | Setting | FedSC | FedAvg | FedNova | Scaffold | FedProx |
E=1 | 86.07% | 77.12% | 76.47% | 75.63% | 78.05% | |
MNIST | E=10 | 92.31% | 87.50% | 87.66% | 89.99% | 87.34% |
E=20 | 93.46% | 89.01% | 89.39% | 91.44% | 88.93% | |
E=1 | 77.15% | 67.63% | 66.45% | 67.37% | 67.08% | |
FMNIST | E=10 | 80.85% | 78.79% | 79.09% | 80.28% | 78.38% |
E=20 | 82.58% | 80.08% | 80.56% | 81.83% | 79.91% | |
E=1 | 67.77% | 50.71% | 50.80% | 50.92% | 50.52% | |
CICIDS2017 | E=10 | 78.33% | 68.88% | 69.29% | 70.80% | 68.92% |
E=20 | 80.05% | 73.78% | 74.06% | 74.86% | 73.36% |
As can be seen from Table 5 and Figure 12, as the number of local training epochs changes, the FedSC algorithm maintains a clear lead. The average accuracy of the FedSC algorithm remains at a high level and the training process is relatively stable, which demonstrates the superiority of the proposed FedSC algorithm.
It can be seen that increasing the number of local training epochs brings some improvement for all algorithms. When there are few local training epochs, a client may not be trained sufficiently and the uploaded model parameters may not be accurate enough, which affects the performance of the global model. Increasing the number of local training epochs enables the local model to better learn the features of the client's local data, so the global model is naturally better and more stable. Therefore, increasing the number of local training epochs is a good way to improve the accuracy and stability of the algorithm. However, more local training epochs increase the local computation of the client, which increases the training time and communication burden. Therefore, the choice of the number of local training epochs is crucial to the performance of federated learning, and factors such as the dataset, model complexity, and computing and communication resources need to be considered.
Federated learning techniques are vulnerable to Byzantine failures [43,44,45], biased local datasets, and poisoning attacks. Here, we verify the performance of the federated learning algorithms when the clients are disturbed by noise. Gaussian noise is added to the dataset to make the client training results inaccurate; Gaussian noise is a popular choice [46], especially for images. We add Gaussian noise to each pixel of the image data and to each entry of the tabular data. The noise is distributed as $\hat{x} \sim \text{Gau}(\mu, \sigma)$; we set the mean of the Gaussian noise to $\mu = 0$ and vary its variance $\sigma$. It can be seen from Figure 13 that the larger the variance of the added Gaussian noise, the blurrier the image.
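A minimal sketch of how such Gaussian perturbation could be applied, assuming the data are stored as NumPy arrays; the function name is ours.

```python
import numpy as np

def add_gaussian_noise(data, sigma, mu=0.0, seed=0):
    """Perturb every value (image pixel or tabular feature) with Gaussian noise Gau(mu, sigma)."""
    rng = np.random.default_rng(seed)
    # `sigma` is used directly as the noise level Z reported in the experiments
    return data + rng.normal(loc=mu, scale=sigma, size=data.shape)

# e.g., noisy_images = add_gaussian_noise(images, sigma=1.0)   # Z = 1 setting
```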
We set the variance of the Gaussian noise to 0, 1, and 5 for the MNIST and FMNIST datasets, and to 0, 1, and 2 for the CICIDS2017 dataset. The experimental results are shown in Table 6 and Figure 14.
Datasets | Setting | FedSC | FedAvg | FedNova | Scaffold | FedProx |
Z=0 | 87.26% | 78.61% | 77.80% | 77.70% | 78.28% | |
MNIST | Z=1 | 81.82% | 73.31% | 71.17% | 72.39% | 74.82% |
Z=5 | 64.74% | 56.56% | 55.56% | 53.39% | 49.91% | |
Z=0 | 74.08% | 68.15% | 67.85% | 67.12% | 67.53% | |
FMNIST | Z=1 | 72.50% | 61.67% | 61.74% | 61.03% | 62.38% |
Z=5 | 44.10% | 37.30% | 31.67% | 33.11% | 29.58% | |
Z=0 | 68.83% | 52.79% | 49.97% | 46.94% | 53.48% | |
CICIDS2017 | Z=1 | 53.94% | 44.24% | 45.35% | 43.71% | 48.95% |
Z=2 | 45.35% | 38.93% | 37.67% | 35.02% | 39.90% |
The added noise disturbs the information in the dataset, which has a negative impact on the performance of federated learning. With the addition of Gaussian noise, the data become more complex and harder to process, making it difficult for the clients to capture the features of the data during local training. Therefore, adding noise makes the training results of some clients unreliable or inaccurate, which affects the update of the global model in the federated learning algorithm. As can be seen from Figure 14, the performance of all federated learning algorithms declines as the level of noise increases. Although the FedSC algorithm is also affected by noise, it still gives better results than the other algorithms. Adding Gaussian noise can increase the privacy of the data, but it also degrades the performance of the model.
To compare the efficiency of the different FL algorithms, we set P = 100, D = 0.5, C = 1, E = 1, and Z = 0. Table 7 shows the total computation time of 100 communication rounds, and Table 8 shows the communication cost per client in each communication round.
Mnist | Fmnist | CICIDS2017 | |
FedSC | 202 s | 202 s | 206 s |
FedAvg | 205 s | 201 s | 210 s |
FedNova | 195 s | 193 s | 206 s |
Scaffold | 264 s | 268 s | 272 s |
FedProx | 378 s | 368 s | 412 s |
Mnist | Fmnist | CICIDS2017 | |
FedSC | 1.21 M | 1.21 M | 0.41 M |
FedAvg | 1.21 M | 1.21 M | 0.41 M |
FedNova | 1.21 M | 1.21 M | 0.41 M |
Scaffold | 2.42 M | 2.42 M | 0.82 M |
FedProx | 1.21 M | 1.21 M | 0.41 M |
We can observe that the time consumed by the FedSC, FedAvg, and FedNova algorithms is very close. However, the Scaffold algorithm introduces additional operations during training, resulting in slightly increased time consumption. FedProx directly modifies the objective, which adds computation overhead to the gradient descent of each batch and therefore takes the longest time. Regarding the communication cost, the FedSC, FedAvg, FedNova, and FedProx algorithms upload only their local model parameters per round, so their communication costs are the same. However, the Scaffold algorithm transmits additional control variables in each round, so its communication cost is twice that of the other algorithms.
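As a rough illustration, assuming the reported "M" values are megabytes of 32-bit model parameters (our interpretation, not stated explicitly in the paper), the per-round upload size can be estimated as follows:

```python
def upload_size_mb(model, bytes_per_param=4):
    """Estimate the per-round upload size of a model's parameters in megabytes (float32)."""
    num_params = sum(p.numel() for p in model.parameters())
    return num_params * bytes_per_param / 1e6

# Scaffold additionally uploads a control variate with the same shape as the model,
# so its per-round communication is roughly twice this estimate.
```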
Federated learning plays a pivotal role in the field of machine learning, particularly as the importance of data privacy continues to grow. Unfortunately, the accuracy of federated learning is significantly influenced by the distribution of data. When the data exhibits non-IID characteristics, the performance of federated learning deteriorates. This paper introduces a novel federated learning algorithm called FedSC, which outperforms other algorithms in non-IID data scenarios. The FedSC algorithm aims to identify local IID data within non-IID data to mitigate the adverse impact of non-IID data on federated learning. Through experimental comparisons, the FedSC algorithm consistently achieves higher accuracy compared to alternative federated learning algorithms. Moreover, the FedSC algorithm ensures data security by completing the model training without sharing data among participating parties. This contribution further advances the development of federated learning.
The authors declare they have not used Artificial Intelligence (AI) tools in the creation of this article.
The authors declare there are no conflicts of interest.
[1] |
Alawi OA, Kamar HM, Abdelrazek AH, et al. (2024) Design optimization of solar collectors with hybrid nanofluids: An integrated ansys and machine learning study. Sol Energ Mat Sol C 271: 112822. https://doi.org/https://doi.org/10.1016/j.solmat.2024.112822 doi: 10.1016/j.solmat.2024.112822
![]() |
[2] |
Alawi OA, Kamar HM, Salih SQ, et al. (2024) Development of optimized machine learning models for predicting flat plate solar collectors thermal efficiency associated with Al2O3-water nanofluids. Eng Appl Artif Intel 133: 108158. https://doi.org/https://doi.org/10.1016/j.engappai.2024.108158 doi: 10.1016/j.engappai.2024.108158
![]() |
[3] |
Ali B (2018) Forecasting model for water-energy nexus in Alberta, Canada. Water-Energy Nexus 1: 104–115. https://doi.org/10.1016/j.wen.2018.08.002 doi: 10.1016/j.wen.2018.08.002
![]() |
[4] |
Ashouri M, Khoshkar Vandani AM, Mehrpooya M, et al. (2015) Techno-economic assessment of a Kalina cycle driven by a parabolic Trough solar collector. Energ Convers Manage 105: 1328–1339. https://doi.org/https://doi.org/10.1016/j.enconman.2015.09.015 doi: 10.1016/j.enconman.2015.09.015
![]() |
[5] |
Azad Gilani H, Hoseinzadeh S (2021) Techno-economic study of compound parabolic collector in solar water heating system in the northern hemisphere. Appl Therm Eng 190: 116756. https://doi.org/https://doi.org/10.1016/j.applthermaleng.2021.116756 doi: 10.1016/j.applthermaleng.2021.116756
![]() |
[6] |
Badescu V (2018) How much work can be extracted from diluted solar radiation? Sol Energy 170: 1095-1100. https://doi.org/10.1016/j.solener.2018.05.094 doi: 10.1016/j.solener.2018.05.094
![]() |
[7] | Barrington-Leigh C, Ouliaris M (2017) The renewable energy landscape in Canada: a spatial analysis. Renew Sust Energ Rev 75: 809–819. |
[8] | Bellos E, Tzivanidis C (2017) A detailed exergetic analysis of parabolic trough collectors. Energ Convers Manage 149: 275–292. |
[9] | Bergman TL (2011) Fundamentals of heat and mass transfer. John Wiley & Sons. |
[10] |
Brenner A, Kahn J, Hirsch T, et al. (2023) Soiling determination for parabolic trough collectors based on operational data analysis and machine learning. Sol Energy 259: 257–276. https://doi.org/https://doi.org/10.1016/j.solener.2023.05.008 doi: 10.1016/j.solener.2023.05.008
![]() |
[11] |
Cuce PM, Cuce E (2023) Performance analysis of fresnel lens driven hot water/steam generator for domestic and industrial use: A CFD research. Hittite J Sci Eng 10: 1–9. https://doi.org/10.17350/HJSE19030000285 doi: 10.17350/HJSE19030000285
![]() |
[12] |
Cuce PM, Cuce E, Guclu T, et al. (2021) Energy saving aspects of green facades: Current applications and challenges. Green Build Constr Econ 2021: 18–28. https://doi.org/10.37256/gbce.2220211007 doi: 10.37256/gbce.2220211007
![]() |
[13] |
Cuce PM, Guclu T, Cuce E (2024) Design, modelling, environmental, economic and performance analysis of parabolic trough solar collector (PTC) based cogeneration systems assisted by thermoelectric generators (TEGs). Sustain Energy Techn 64: 103745. https://doi.org/https://doi.org/10.1016/j.seta.2024.103745 doi: 10.1016/j.seta.2024.103745
![]() |
[14] |
Deniz E, Çınar S (2016) Energy, exergy, economic and environmental (4E) analysis of a solar desalination system with humidification-dehumidification. Energ Convers Manage 126: 12–19. https://doi.org/10.1016/j.enconman.2016.07.064 doi: 10.1016/j.enconman.2016.07.064
![]() |
[15] |
Desai NB, Pranov H, Haglind F (2021) Techno-economic analysis of a foil-based solar collector driven electricity and fresh water generation system. Renew Energy 165: 642–656. https://doi.org/https://doi.org/10.1016/j.renene.2020.11.043 doi: 10.1016/j.renene.2020.11.043
![]() |
[16] | Duffie JA, Beckman WA, Blair N (2020) Solar engineering of thermal processes, photovoltaics and wind. John Wiley & Sons. |
[17] |
El-Shorbagy MA, El-Refaey AM (2022) A hybrid genetic–firefly algorithm for engineering design problems. J Comput Des Eng 9: 706–730. https://doi.org/10.1093/jcde/qwac013 doi: 10.1093/jcde/qwac013
![]() |
[18] |
Elfeky KE, Wang Q (2023) Techno-environ-economic assessment of photovoltaic and CSP with storage systems in China and Egypt under various climatic conditions. Renew Energy 215: 118930. https://doi.org/https://doi.org/10.1016/j.renene.2023.118930 doi: 10.1016/j.renene.2023.118930
![]() |
[19] |
Elsheikh A, Zayed M, Aboghazala A, et al. (2024) Innovative solar distillation system with prismatic absorber basin: Experimental analysis and LSTM machine learning modeling coupled with great wall construction algorithm. Process Saf Environ Prot 186: 1120–1133. https://doi.org/https://doi.org/10.1016/j.psep.2024.04.063 doi: 10.1016/j.psep.2024.04.063
![]() |
[20] |
Faizal M, Saidur R, Mekhilef S, et al. (2015) Energy, economic, and environmental analysis of a flat-plate solar collector operated with SiO2 nanofluid. Clean Technol Envir 17: 1457–1473. https://doi.org/10.1007/s10098-014-0870-0 doi: 10.1007/s10098-014-0870-0
![]() |
[21] |
Fister I, Fister Jr I, Yang XS, et al. (2013) A comprehensive review of firefly algorithms. Swarm Evol Comput 13: 34–46. https://doi.org/10.1016/j.swevo.2013.06.001 doi: 10.1016/j.swevo.2013.06.001
![]() |
[22] | Forristall R (2003) Heat transfer analysis and modeling of a parabolic trough solar receiver implemented in engineering equation solver. |
[23] | Frangopoulos CA (1987) Thermo-economic functional analysis and optimization. Energy 12: 563–571. |
[24] | García-García JC, Ponce-Rocha JD, Marmolejo-Correa D, et al. (2019) Exergy analysis for energy integration in a bioethanol production process to determine heat exchanger networks feasibility. In A. A. Kiss, E. Zondervan, R. Lakerveld, & L. Özkan (Eds.), Computer Aided Chemical Engineering, 46: 475–480. Elsevier. https://doi.org/https://doi.org/10.1016/B978-0-12-818634-3.50080-1 |
[25] | Gnielinski V (1976) New equations for heat and mass transfer in turbulent pipe and channel flow. Int J Chem Eng 16: 359–367. |
[26] | Haykin S (1998) Neural networks: a comprehensive foundation. Prentice Hall PTR. |
[27] | Jehring L (1992) Bejan, A., Advanced Engineering Thermodynamics. New York etc., John Wiley & Sons 1988. XXIII, 758. ISBN 0‐471‐83043‐7, Wiley Online Library. |
[28] | Johari NF, Zain AM, Noorfa MH, et al. (2013) Firefly algorithm for optimization problem. Appl Mech Mater 421: 512–517. |
[29] |
Kalogirou SA (2004) Solar thermal collectors and applications. Prog Energ Combust Sci 30: 231–295. https://doi.org/10.1016/j.pecs.2004.02.001 doi: 10.1016/j.pecs.2004.02.001
![]() |
[30] |
Kottala RK, Balasubramanian KR, Jinshah BS, et al. (2023) Experimental investigation and machine learning modelling of phase change material-based receiver tube for natural circulated solar parabolic trough system under various weather conditions. J Therm Anal Calorim 148: 7101–7124. https://doi.org/10.1007/s10973-023-12219-9 doi: 10.1007/s10973-023-12219-9
![]() |
[31] | Landsberg P, Mallinson J (1976) Thermodynamic constraints, effective temperatures and solar cells. International Conference on Solar Electricity. |
[32] | Mazen F, AbulSeoud RA, Gody AM (2016) Genetic algorithm and firefly algorithm in a hybrid approach for breast cancer diagnosis. Int J Comput Trends Tech 32: 62–68. |
[33] |
Mehdipour R, Baniamerian Z, Golzardi S, et al. (2020) Geometry modification of solar collector to improve performance of solar chimneys. Renew Energy 162: 160–170. https://doi.org/https://doi.org/10.1016/j.renene.2020.07.151 doi: 10.1016/j.renene.2020.07.151
![]() |
[34] |
Mustafa J, Alqaed S, Sharifpur M (2022) Numerical study on performance of double-fluid parabolic trough solar collector occupied with hybrid non-Newtonian nanofluids: Investigation of effects of helical absorber tube using deep learning. Eng Anal Bound Elem 140: 562–580. https://doi.org/https://doi.org/10.1016/j.enganabound.2022.04.033 doi: 10.1016/j.enganabound.2022.04.033
![]() |
[35] |
Nguyen HM, Omidkar A, Li W, et al. (2023) Non-thermal plasma assisted catalytic nitrogen fixation with methane at ambient conditions. Chem Eng J 471: 144748. https://doi.org/https://doi.org/10.1016/j.cej.2023.144748 doi: 10.1016/j.cej.2023.144748
![]() |
[36] |
Omidkar A, Alagumalai A, Li Z, et al. (2024) Machine learning assisted techno-economic and life cycle assessment of organic solid waste upgrading under natural gas. Appl Energy 355: 122321. https://doi.org/https://doi.org/10.1016/j.apenergy.2023.122321 doi: 10.1016/j.apenergy.2023.122321
![]() |
[37] |
Omidkar A, Haddadian K, Es'haghian R, et al. (2024) Novel energy efficient in-situ bitumen upgrading technology to facilitate pipeline transportation using natural gas: Sustainability evaluation using a new hybrid approach based on fuzzy multi-criteria decision-making tool and techno-economic and life cycle assessment. Energy 297: 131280. https://doi.org/https://doi.org/10.1016/j.energy.2024.131280 doi: 10.1016/j.energy.2024.131280
![]() |
[38] |
Omidkar A, Xu H, Li Z, et al. (2023) Techno-economic and life cycle assessment of renewable diesel production via methane-assisted catalytic waste cooking oil upgrading. J Clean Prod 414: 137512. https://doi.org/https://doi.org/10.1016/j.jclepro.2023.137512 doi: 10.1016/j.jclepro.2023.137512
![]() |
[39] |
Pal RK, K RK (2021) Investigations of thermo-hydrodynamics, structural stability, and thermal energy storage for direct steam generation in parabolic trough solar collector: A comprehensive review. J Clean Product 311: 127550. https://doi.org/https://doi.org/10.1016/j.jclepro.2021.127550 doi: 10.1016/j.jclepro.2021.127550
![]() |
[40] | Patel S, Parkins JR (2023) Assessing motivations and barriers to renewable energy development: Insights from a survey of municipal decision-makers in Alberta, Canada. Energy Rep 9: 5788–5798. |
[41] |
Pourasl HH, Barenji RV, Khojastehnezhad VM (2023) Solar energy status in the world: A comprehensive review. Energy Rep 10: 3474–3493. https://doi.org/https://doi.org/10.1016/j.egyr.2023.10.022 doi: 10.1016/j.egyr.2023.10.022
![]() |
[42] |
Ruiz-Moreno S, Sanchez AJ, Gallego AJ, et al. (2022) A deep learning-based strategy for fault detection and isolation in parabolic-trough collectors. Renew Energy 186: 691–703. https://doi.org/https://doi.org/10.1016/j.renene.2022.01.029 doi: 10.1016/j.renene.2022.01.029
![]() |
[43] |
Salari A, Shakibi H, Soltani S, et al. (2024) Optimization assessment and performance analysis of an ingenious hybrid parabolic trough collector: A machine learning approach. Appl Energy 353: 122062. https://doi.org/https://doi.org/10.1016/j.apenergy.2023.122062 doi: 10.1016/j.apenergy.2023.122062
![]() |
[44] |
Shafieian A, Parastvand H, Khiadani M (2020) Comparative and performative investigation of various data-based and conventional theoretical methods for modelling heat pipe solar collectors. Sol Energy 198: 212–223. https://doi.org/https://doi.org/10.1016/j.solener.2020.01.056 doi: 10.1016/j.solener.2020.01.056
![]() |
[45] |
Shboul B, Zayed ME, Al-Tawalbeh N, et al. (2024) Dynamic numerical modeling and performance optimization of solar and wind assisted combined heat and power system coupled with battery storage and sophisticated control framework. Results Eng 22: 102198. https://doi.org/https://doi.org/10.1016/j.rineng.2024.102198 doi: 10.1016/j.rineng.2024.102198
![]() |
[46] |
Shboul B, Zayed ME, Tariq R, et al. (2024) New hybrid photovoltaic-fuel cell system for green hydrogen and power production: Performance optimization assisted with Gaussian process regression method. Int J Hydrogen Energ 59: 1214–1229. https://doi.org/https://doi.org/10.1016/j.ijhydene.2024.02.087 doi: 10.1016/j.ijhydene.2024.02.087
![]() |
[47] |
Sultan AJ, Hughes KJ, Ingham DB, et al. (2020) Techno-economic competitiveness of 50 MW concentrating solar power plants for electricity generation under Kuwait climatic conditions. Renew Sust Energ Rev 134: 110342. https://doi.org/10.1016/j.rser.2020.110342 doi: 10.1016/j.rser.2020.110342
![]() |
[48] |
Tabarhoseini SM, Sheikholeslami M, Said Z (2022) Recent advances on the evacuated tube solar collector scrutinizing latest innovations in thermal performance improvement involving economic and environmental analysis. Sol Energ Mat Sol C 241: 111733. https://doi.org/https://doi.org/10.1016/j.solmat.2022.111733 doi: 10.1016/j.solmat.2022.111733
![]() |
[49] |
Vakili M, Salehi SA (2023) A review of recent developments in the application of machine learning in solar thermal collector modelling. Environ Sci Pollut Res 30: 2406–2439. https://doi.org/10.1007/s11356-022-24044-y doi: 10.1007/s11356-022-24044-y
![]() |
[50] | Vapnik VN (1999) An overview of statistical learning theory. Ieee T Neural Networ 10: 988–999. |
[51] |
Wahid F, Alsaedi AKZ, Ghazali R (2019) Using improved firefly algorithm based on genetic algorithm crossover operator for solving optimization problems. J Intell Fuzzy Syst 36: 1547–1562. https://doi.org/10.3233/JIFS-181936 doi: 10.3233/JIFS-181936
![]() |
[52] |
Wahid F, Ghazali R, Ismail LH (2019) Improved firefly algorithm based on genetic algorithm operators for energy efficiency in smart buildings. Arab J Sci Eng 44: 4027–4047. https://doi.org/10.1007/s13369-019-03759-0 doi: 10.1007/s13369-019-03759-0
![]() |
[53] |
Wang Y, Kandeal AW, Swidan A, et al. (2021) Prediction of tubular solar still performance by machine learning integrated with Bayesian optimization algorithm. Appl Therm Eng 184: 116233. https://doi.org/https://doi.org/10.1016/j.applthermaleng.2020.116233 doi: 10.1016/j.applthermaleng.2020.116233
![]() |
[54] |
Wu S, Wang C, Tang R (2022) Optical efficiency and performance optimization of a two-stage secondary reflection hyperbolic solar concentrator using machine learning. Renew Energy 188: 437–449. https://doi.org/https://doi.org/10.1016/j.renene.2022.01.117 doi: 10.1016/j.renene.2022.01.117
![]() |
[55] |
Yang XS (2010) Firefly algorithm, stochastic test functions and design optimisation. Int J Bio-Inspir Com 2: 78–84. https://doi.org/10.1504/IJBIC.2010.032124 doi: 10.1504/IJBIC.2010.032124
![]() |
[56] |
Zayed ME, Aboelmaaref MM, Chazy M (2023a) Design of solar air conditioning system integrated with photovoltaic panels and thermoelectric coolers: Experimental analysis and machine learning modeling by random vector functional link coupled with white whale optimization. Therm Sci Eng Prog 44: 102051. https://doi.org/https://doi.org/10.1016/j.tsep.2023.102051 doi: 10.1016/j.tsep.2023.102051
![]() |
[57] |
Zayed ME, Kabeel AE, Shboul B, et al. (2023b) Performance augmentation and machine learning-based modeling of wavy corrugated solar air collector embedded with thermal energy storage: Support vector machine combined with Monte Carlo simulation. J Energy Storage 74: 109533. https://doi.org/https://doi.org/10.1016/j.est.2023.109533 doi: 10.1016/j.est.2023.109533
![]() |
[58] |
Zayed ME, Zhao J, Li W, et al. (2021) Predicting the performance of solar dish Stirling power plant using a hybrid random vector functional link/chimp optimization model. Solar Energy 222: 1–17. https://doi.org/https://doi.org/10.1016/j.solener.2021.03.087 doi: 10.1016/j.solener.2021.03.087
| Dataset | Training instances | Test instances | Features | Classes |
|---|---|---|---|---|
| MNIST | 60,000 | 10,000 | 784 | 10 |
| FMNIST | 60,000 | 10,000 | 784 | 10 |
| CICIDS2017 | 60,000 | 10,000 | 256 | 10 |
| Dataset | Setting | FedSC | FedAvg | FedNova | Scaffold | FedProx |
|---|---|---|---|---|---|---|
| MNIST | p=10 | 91.44% | 86.57% | 87.90% | 90.26% | 86.51% |
| MNIST | p=100 | 88.96% | 78.13% | 77.61% | 76.20% | 77.16% |
| MNIST | p=1000 | 73.40% | 44.21% | 44.14% | 47.13% | 41.98% |
| FMNIST | p=10 | 80.33% | 77.60% | 78.31% | 79.45% | 78.00% |
| FMNIST | p=100 | 65.56% | 68.47% | 68.34% | 66.69% | 67.82% |
| FMNIST | p=1000 | 66.65% | 53.05% | 49.99% | 47.67% | 53.16% |
| CICIDS2017 | p=10 | 81.20% | 69.27% | 69.59% | 71.94% | 69.38% |
| CICIDS2017 | p=100 | 58.59% | 49.15% | 51.16% | 50.28% | 49.18% |
| CICIDS2017 | p=1000 | 48.77% | 25.48% | 26.47% | 25.39% | 28.59% |
| Dataset | Setting | FedSC | FedAvg | FedNova | Scaffold | FedProx |
|---|---|---|---|---|---|---|
| MNIST | D→0 | 81.34% | 74.17% | 75.70% | 78.61% | 72.65% |
| MNIST | D=0.1 | 84.12% | 77.40% | 77.70% | 76.13% | 76.80% |
| MNIST | D→∞ | 90.26% | 76.08% | 77.56% | 78.54% | 77.42% |
| FMNIST | D→0 | 64.18% | 62.46% | 62.90% | 64.81% | 60.58% |
| FMNIST | D=0.1 | 68.23% | 66.60% | 68.28% | 67.15% | 65.88% |
| FMNIST | D→∞ | 80.38% | 66.55% | 66.43% | 66.83% | 66.85% |
| CICIDS2017 | D→0 | 47.99% | 31.70% | 46.35% | 47.46% | 31.77% |
| CICIDS2017 | D=0.1 | 51.70% | 43.84% | 48.70% | 49.14% | 42.39% |
| CICIDS2017 | D→∞ | 80.52% | 56.53% | 56.55% | 56.64% | 56.05% |
| Dataset | Setting | FedSC | FedAvg | FedNova | Scaffold | FedProx |
|---|---|---|---|---|---|---|
| MNIST | C=0.2 | 83.39% | 76.06% | 75.30% | 76.70% | 76.70% |
| MNIST | C=0.6 | 86.17% | 76.01% | 76.92% | 77.15% | 76.57% |
| MNIST | C=1.0 | 86.68% | 77.72% | 77.48% | 76.71% | 77.85% |
| FMNIST | C=0.2 | 73.29% | 65.03% | 64.63% | 66.20% | 65.65% |
| FMNIST | C=0.6 | 71.81% | 67.27% | 67.31% | 67.16% | 67.31% |
| FMNIST | C=1.0 | 71.84% | 67.55% | 65.91% | 66.19% | 67.06% |
| CICIDS2017 | C=0.2 | 65.18% | 45.16% | 48.24% | 51.25% | 43.99% |
| CICIDS2017 | C=0.6 | 66.93% | 48.73% | 51.49% | 51.99% | 46.38% |
| CICIDS2017 | C=1.0 | 68.14% | 46.96% | 51.20% | 50.32% | 49.76% |
| Dataset | Setting | FedSC | FedAvg | FedNova | Scaffold | FedProx |
|---|---|---|---|---|---|---|
| MNIST | E=1 | 86.07% | 77.12% | 76.47% | 75.63% | 78.05% |
| MNIST | E=10 | 92.31% | 87.50% | 87.66% | 89.99% | 87.34% |
| MNIST | E=20 | 93.46% | 89.01% | 89.39% | 91.44% | 88.93% |
| FMNIST | E=1 | 77.15% | 67.63% | 66.45% | 67.37% | 67.08% |
| FMNIST | E=10 | 80.85% | 78.79% | 79.09% | 80.28% | 78.38% |
| FMNIST | E=20 | 82.58% | 80.08% | 80.56% | 81.83% | 79.91% |
| CICIDS2017 | E=1 | 67.77% | 50.71% | 50.80% | 50.92% | 50.52% |
| CICIDS2017 | E=10 | 78.33% | 68.88% | 69.29% | 70.80% | 68.92% |
| CICIDS2017 | E=20 | 80.05% | 73.78% | 74.06% | 74.86% | 73.36% |
| Dataset | Setting | FedSC | FedAvg | FedNova | Scaffold | FedProx |
|---|---|---|---|---|---|---|
| MNIST | Z=0 | 87.26% | 78.61% | 77.80% | 77.70% | 78.28% |
| MNIST | Z=1 | 81.82% | 73.31% | 71.17% | 72.39% | 74.82% |
| MNIST | Z=5 | 64.74% | 56.56% | 55.56% | 53.39% | 49.91% |
| FMNIST | Z=0 | 74.08% | 68.15% | 67.85% | 67.12% | 67.53% |
| FMNIST | Z=1 | 72.50% | 61.67% | 61.74% | 61.03% | 62.38% |
| FMNIST | Z=5 | 44.10% | 37.30% | 31.67% | 33.11% | 29.58% |
| CICIDS2017 | Z=0 | 68.83% | 52.79% | 49.97% | 46.94% | 53.48% |
| CICIDS2017 | Z=1 | 53.94% | 44.24% | 45.35% | 43.71% | 48.95% |
| CICIDS2017 | Z=2 | 45.35% | 38.93% | 37.67% | 35.02% | 39.90% |
| Method | MNIST | FMNIST | CICIDS2017 |
|---|---|---|---|
| FedSC | 202 s | 202 s | 206 s |
| FedAvg | 205 s | 201 s | 210 s |
| FedNova | 195 s | 193 s | 206 s |
| Scaffold | 264 s | 268 s | 272 s |
| FedProx | 378 s | 368 s | 412 s |
| Method | MNIST | FMNIST | CICIDS2017 |
|---|---|---|---|
| FedSC | 1.21 M | 1.21 M | 0.41 M |
| FedAvg | 1.21 M | 1.21 M | 0.41 M |
| FedNova | 1.21 M | 1.21 M | 0.41 M |
| Scaffold | 2.42 M | 2.42 M | 0.82 M |
| FedProx | 1.21 M | 1.21 M | 0.41 M |