Research article

One step proximal point schemes for monotone vector field inclusion problems

  • Received: 09 November 2021 Revised: 09 January 2022 Accepted: 17 January 2022 Published: 11 February 2022
  • MSC : 47H05, 47J25, 49J40, 65K10, 65K15

  • In this paper, we propose a one-step convex combination of proximal point algorithms for a countable collection of monotone vector fields in CAT(0) spaces. We establish Δ-convergence and strong convergence theorems for approximating a common solution of a countable family of monotone vector field inclusion problems. Furthermore, we apply our methods to solve a family of minimization problems, compute the Fréchet mean and geometric median in CAT(0) spaces, and solve a kinematic problem in robotic motion control. Finally, we give a numerical example to show the efficiency and robustness of the proposed scheme in comparison to a known scheme in the literature.

    Citation: Sani Salisu, Poom Kumam, Songpon Sriwongsa. One step proximal point schemes for monotone vector field inclusion problems[J]. AIMS Mathematics, 2022, 7(5): 7385-7402. doi: 10.3934/math.2022412




    Power load forecasting can be divided into long-term, medium-term, short-term and ultra-short-term forecasting according to the forecasting time scale. Short-term load forecasting is a critical basis for maintaining the stable operation of the power system, and its accuracy plays an important role in helping the power dispatching department plan the next dispatching step. Accurate short-term load forecasting can effectively reduce resource waste and improve economic benefits [1,2,3].

    At present, load prediction methods primarily include statistical methods, such as multiple linear regression [4], the Kalman filter [5,6] and the autoregressive moving average, and machine learning methods, such as the support vector machine [7,8,9], expert systems and artificial neural networks [10]. Statistical models are too simple: they can only deal with linear data and cannot reasonably capture the inherent characteristics of nonlinear data. Although machine learning methods can deal with nonlinear data well, they cannot extract time-series features effectively. With the development of deep learning, deep neural networks have become the focus of load forecasting. Widely employed networks include convolutional neural networks (CNNs), recurrent neural networks (RNNs) [11] and long short-term memory (LSTM) networks [12]. CNNs can effectively extract multidimensional data features, but they cannot deal with time-series features efficiently. RNNs can model long time-series data through a cyclic structure, but as the amount of load data increases, they suffer from vanishing or exploding gradients. As a special RNN, the LSTM network mitigates this deficiency through the use of a gate structure. Nonetheless, as the amount of training data increases, it becomes difficult to select parameters for the LSTM network [13]. In order to effectively process multidimensional power load data, a CNN-LSTM hybrid neural network prediction method was proposed in [14], in which feature extraction is carried out through a two-dimensional convolutional layer to reduce the training difficulty of the LSTM network model.

    The authors of [15] showed that using a CNN to extract data features, using a gated recurrent unit (GRU) network to avoid the large number of training parameters in the LSTM network, and introducing an attention mechanism can effectively improve the accuracy of power load forecasting. The authors of [16] found that their CNN-BiGRU network improves data utilization by making data flow bidirectionally in the network layer. Since CNNs cannot predict time-series data well, temporal convolutional networks (TCNs) can be employed for sequence data prediction; a TCN can extract time-series features better than a CNN or an RNN [17]. Tian et al. [18] proposed a short-term wind speed prediction model that employs empirical mode decomposition and an improved sparrow algorithm to optimize the LSTM neural network. The model decomposes the ultra-short-term wind speed by empirical mode decomposition, predicts it with the LSTM network and optimizes the LSTM hyperparameters with the improved sparrow optimization algorithm. In [19], a short-term wind speed prediction model based on local mean decomposition (LMD) and a combined-kernel least squares support vector machine (LSSVM) is proposed. Wind speed data are decomposed by the LMD algorithm and predicted by the LSSVM, and the firefly algorithm is employed to optimize the parameter selection. The authors of [20] proposed a temporal convolutional network with a multi-attention mechanism. By introducing an initial structure into the TCN, multidimensional information was extracted by convolutional kernels of different scales, effectively improving the accuracy of ultra-short-term load prediction.

    The authors of [21] proposed a combined prediction model based on empirical mode decomposition to forecast traffic flow state information. The data are decomposed into components by empirical mode decomposition, the optimal prediction method is selected based on the results of adaptive analysis and the combined model weights are optimized by employing the fruit fly algorithm. The authors of [22] proposed a combined prediction model based on ensemble empirical mode decomposition and a regularized extreme learning machine for wind speed prediction. The wind speed components obtained by ensemble empirical mode decomposition are predicted by the regularized extreme learning machine, and the reliability of the prediction model is improved by cross-validation. In [23], complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) and variational mode decomposition (VMD) are first employed to decompose power load data. Second, the non-stationary and stationary components are predicted by deep bidirectional long short-term memory (DBILSTM) and mixed logistic regression (MLR) networks, respectively. Finally, the prediction accuracy is improved by combining the predictions through data reconstruction.

    This paper presents a multi-model fusion method for short-term power load forecasting based on VMD, an improved whale optimization algorithm (WOA), a wavelet temporal convolutional network (WTCN)-BiGRU network and CatBoost. First, VMD is employed to decompose the power load data into different intrinsic mode functions (IMFs), and weather characteristic factors are added to each intrinsic mode. Then, the TCN is utilized to extract multidimensional data features, and the extracted features are sent to the BiGRU network for model training. Adding an attention mechanism effectively retains the influence of the necessary information. The model adopts the improved WOA (IWOA) to optimize the hyperparameter selection of the TCN-BiGRU-attention network, which makes the parameter selection of the network layers more effective, while the CatBoost network predicts the stationary component of the sequence in parallel. The parallel prediction results of the two networks are combined with the mean absolute percentage error-reciprocal weight (MAPE-RW) algorithm. The model accuracy is verified on public load data from an Australian region, and the model performance is evaluated using the root mean square error (RMSE) and mean absolute percentage error (MAPE). The contributions of this paper can be summarized as follows:

    1) A multi-model fusion load forecasting method is proposed. First, the VMD-based IWOA-WTCN-BiGRU-attention prediction model is constructed as Model one. Second, the CatBoost prediction model based on a random search algorithm is adopted as Model two. Finally, the MAPE-RW algorithm is utilized to fuse the load prediction results of the two models to achieve an accurate prediction of the power load value.

    2) The Morlet wavelet function is used to improve the TCN. The Morlet wavelet basis function is introduced as the activation function of the TCN residual block.

    3) An improved WOA is proposed. The traditional WOA is improved by introducing a nonlinear convergence factor and adaptive weight, which improves the convergence speed and the convergence accuracy of the algorithm.

    The paper has been organized in the following way. In Section 2, the network principles employed in the multi-model structure are introduced. In Section 3, the load prediction model structure of the multi-model fusion network is proposed. In Section 4, the experimental validation of the multi-model fusion network is performed and the experimental results are analyzed. Finally, the whole paper is summarized and future work to be carried out is presented.

    As mentioned in [24], Dragomiretskiy and Zosso proposed VMD in 2014. It is an adaptive and completely non-recursive mode decomposition method. VMD has a more solid theoretical basis and can better suppress mode aliasing by controlling the bandwidth. The method is suitable for non-stationary sequence data and can decompose the data set into multiple stationary sub-sequences with different frequency scales. The VMD solution procedure is as follows.

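    The constrained variational problem is constructed as

    $$\begin{cases}\min\limits_{\{u_k\},\{\omega_k\}}\left\{\sum\limits_{k=1}^{K}\left\|\partial_t\left[\left(\delta(t)+\dfrac{j}{\pi t}\right)*u_k(t)\right]e^{-j\omega_k t}\right\|_2^2\right\}\\ \text{s.t.}\ \sum\limits_{k=1}^{K}u_k(t)=S\end{cases} \tag{2.1}$$

    where $\{u_k\}$ and $\{\omega_k\}$ denote the modes obtained by VMD and their center frequencies, respectively, $K$ is the number of IMFs and $S$ is the original signal (written $f(t)$ below).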

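    The penalty factor $\alpha$ and the Lagrange multiplier $\lambda$ are introduced to transform the constrained variational problem into the following unconstrained variational problem:

    $$L(\{u_k\},\{\omega_k\},\lambda)=\alpha\sum_{k=1}^{K}\left\|\partial_t\left[\left(\delta(t)+\frac{j}{\pi t}\right)*u_k(t)\right]e^{-j\omega_k t}\right\|_2^2+\left\|f(t)-\sum_{k=1}^{K}u_k(t)\right\|_2^2+\left\langle\lambda(t),f(t)-\sum_{k=1}^{K}u_k(t)\right\rangle \tag{2.2}$$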

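    The above unconstrained variational problem is solved by the alternating direction method of multipliers, and the iteration stops when the following convergence condition is satisfied:

    $$\sum_{k=1}^{K}\frac{\left\|\hat u_k^{n+1}-\hat u_k^{n}\right\|_2^2}{\left\|\hat u_k^{n}\right\|_2^2}<\gamma \tag{2.3}$$

    where $\gamma$ is the allowable error, $n$ is the number of iterations and $\hat u_k^{n+1}(\omega)$, $\hat f(\omega)$ and $\hat\lambda^{n}(\omega)$ are the Fourier transforms of $u_k^{n+1}(t)$, $f(t)$ and $\lambda^{n}(t)$, respectively.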

    The TCN was first proposed by Bai et al. [25] in 2018 and is mainly employed for time-series prediction tasks such as probability prediction and traffic prediction. The TCN evolved from the CNN and can extract load data features effectively. In this paper, a TCN is employed within the multi-model fusion network to extract feature information from the time series and remove invalid features, thereby improving the accuracy of power load prediction. The TCN is composed of causal convolution, expansion (dilated) convolution and a residual block [26].

    Causal convolution adopts a one-dimensional fully convolutional network framework. Zero padding is introduced so that the input layer, hidden layers and output layer keep the same length, which avoids the loss of effective information. The output $y_t$ at time $t$ depends only on the current input $x_t$ and the earlier inputs $(x_{t-1},x_{t-2},\ldots,x_{t-n})$. The convolutional calculation is shown in Figure 1.

    Figure 1.  Convolutional calculation process.

    Expansion convolution can increase the receptive field size of the output unit without increasing the number of parameters. The convolutional calculation method is as follows:

    $$F(s)=(x*_d f)(s)=\sum_{i=0}^{k-1}f(i)\,x_{s-d\cdot i} \tag{2.4}$$

    where $d$ is the expansion rate, $f$ is the filter of size $k$ and $x_{s-d\cdot i}$ is the input at the current time and at historical times.
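
    Eq (2.4) can be illustrated with a minimal pure-Python sketch. The function name `dilated_causal_conv`, the toy input and the size-2 filter are assumptions made for illustration; they are not taken from the paper's implementation. Positions before the start of the sequence are treated as zero padding, so the output keeps the input length and never uses future samples.

```python
def dilated_causal_conv(x, f, d):
    """Compute F(s) = sum_{i=0}^{k-1} f[i] * x[s - d*i] for every position s.

    Indices that fall before the start of the sequence are treated as
    zeros (left zero padding), so the output has the same length as x.
    """
    k = len(f)
    out = []
    for s in range(len(x)):
        total = 0.0
        for i in range(k):
            idx = s - d * i
            if idx >= 0:  # negative indices belong to the zero padding
                total += f[i] * x[idx]
        out.append(total)
    return out


x = [1.0, 2.0, 3.0, 4.0, 5.0]        # toy input sequence (made up)
f = [1.0, 1.0]                       # hypothetical filter of size k = 2
y1 = dilated_causal_conv(x, f, d=1)  # each output sums x[s] and x[s-1]
y2 = dilated_causal_conv(x, f, d=2)  # each output sums x[s] and x[s-2]
```

    With the same two filter taps, raising the expansion rate from 1 to 2 widens the span of history each output sees from two to three time steps, which is how the receptive field grows without extra parameters.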

    The core idea of the residual block is to introduce a "skip connection" across one or more layers, and the network structure is shown in Figure 2. The left channel introduces weight normalization, which accelerates gradient descent, and a nonlinear activation function. The right channel is a convolution on the direct connection, which ensures that the input and output data dimensions are consistent. The residual block output is

    $$h(x)=\mathrm{Activation}(x+F(x)) \tag{2.5}$$

    where $x$ and $h(x)$ are the input and output of the residual block, respectively. The network output $h(x)$ is the result of linear transformation and activation function mapping.

    Figure 2.  Structure of residual block.

    The WTCN is based on the TCN topology, and the Morlet wavelet basis function is introduced into the residual block as its activation function. The Morlet wavelet basis function is expressed as

    $$y=\cos(1.75x)\,e^{-x^{2}/2} \tag{2.6}$$
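
    As a rough sketch of how Eqs (2.5) and (2.6) fit together, the snippet below applies the Morlet function to the sum of the block input and a branch output. The names `morlet` and `residual_block` and the toy branch F(x) = 0.5x are assumptions for illustration only; the real branch in the paper is a stack of dilated convolutions.

```python
import math


def morlet(x):
    """Morlet wavelet basis function y = cos(1.75 x) * exp(-x^2 / 2)."""
    return math.cos(1.75 * x) * math.exp(-x * x / 2.0)


def residual_block(x, branch):
    """h(x) = Activation(x + F(x)) with the Morlet function as activation.

    `branch` is a hypothetical stand-in for the convolutional branch F.
    """
    return [morlet(xi + fi) for xi, fi in zip(x, branch(x))]


# Toy branch F(x) = 0.5 * x, chosen only to make the example runnable.
h = residual_block([0.0, 1.0, -1.0], lambda v: [0.5 * vi for vi in v])
```

    The Morlet function is even and bounded by 1 in magnitude, so symmetric pre-activations give identical outputs.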

    The GRU network has only two gating units and is therefore simpler than the LSTM network. It inherits the advantages of the LSTM network and improves the training speed while maintaining training accuracy [27]. The GRU network structure is shown in Figure 3. By changing the GRU network into a bidirectional GRU (BiGRU) network, information can be transmitted in both directions in the network layer, and the prediction accuracy of the network model is effectively improved [28]. The BiGRU network structure is shown in Figure 4.

    Figure 3.  GRU network structure.
    Figure 4.  BiGRU network model.

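    The GRU and BiGRU calculation formulas are

    $$\begin{cases}z_t=\sigma(W_z\cdot[h_{t-1},x_t])\\ r_t=\sigma(W_r\cdot[h_{t-1},x_t])\\ \tilde h_t=\tanh(W_{\tilde h}\cdot[r_t\odot h_{t-1},x_t])\\ h_t=(1-z_t)\odot h_{t-1}+z_t\odot\tilde h_t\end{cases} \tag{2.7}$$
    $$\begin{cases}\overrightarrow{h}_t=\mathrm{GRU}(x_t,\overrightarrow{h}_{t-1})\\ \overleftarrow{h}_t=\mathrm{GRU}(x_t,\overleftarrow{h}_{t-1})\\ h_t=w_t\overrightarrow{h}_t+v_t\overleftarrow{h}_t+b_t\end{cases} \tag{2.8}$$

    where $z_t$ is the update gate and $r_t$ is the reset gate, both of which are jointly determined by the input $x_t$, the hidden layer output $h_{t-1}$ at the previous moment and the activation function $\sigma$. $\tilde h_t$ is the candidate hidden state and $h_t$ is the hidden layer output. $W_z$, $W_r$ and $W_{\tilde h}$ are trainable parameter matrices. In Eq (2.8), $\overrightarrow{h}_t$ is the state of the forward hidden layer, $\overleftarrow{h}_t$ is the state of the backward hidden layer, $w_t$ and $v_t$ are their weights and $b_t$ is the bias of the hidden layer at the current time.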
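
    The gate equations can be traced step by step in a small pure-Python sketch. Everything named here (`gru_step`, `run_gru`, `bigru`, the 2x3 toy weight matrix `W`) is hypothetical, biases inside the gates are omitted as in Eq (2.7), and the combination weights of Eq (2.8) are simplified to constants w = v = 0.5, b = 0. The backward pass is realized by running the same recurrence over the reversed sequence.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def gru_step(x_t, h_prev, Wz, Wr, Wh):
    """One GRU update following Eq (2.7)."""
    concat = h_prev + x_t                                   # [h_{t-1}, x_t]
    z = [sigmoid(v) for v in matvec(Wz, concat)]            # update gate
    r = [sigmoid(v) for v in matvec(Wr, concat)]            # reset gate
    gated = [ri * hi for ri, hi in zip(r, h_prev)] + x_t    # [r_t * h_{t-1}, x_t]
    h_tilde = [math.tanh(v) for v in matvec(Wh, gated)]     # candidate state
    return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h_prev, h_tilde)]


def run_gru(xs, hidden, Wz, Wr, Wh):
    """Run the recurrence over a sequence, starting from a zero state."""
    h, states = [0.0] * hidden, []
    for x_t in xs:
        h = gru_step(x_t, h, Wz, Wr, Wh)
        states.append(h)
    return states


def bigru(xs, hidden, fwd, bwd, w=0.5, v=0.5, b=0.0):
    """Combine forward and backward states as in Eq (2.8)."""
    f_states = run_gru(xs, hidden, *fwd)
    b_states = run_gru(xs[::-1], hidden, *bwd)[::-1]
    return [[w * f + v * g + b for f, g in zip(fs, bs)]
            for fs, bs in zip(f_states, b_states)]


W = [[0.1, 0.2, 0.3], [-0.1, 0.0, 0.2]]   # toy weights: hidden size 2, input size 1
seq = [[1.0], [2.0], [3.0]]               # toy input sequence (made up)
out = bigru(seq, 2, (W, W, W), (W, W, W))
```

    Because each hidden state is a convex combination of the previous state and a tanh output, every component stays strictly between -1 and 1.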

    The attention mechanism is an intuitive interpretation method that imitates human visual mechanisms. It is often exploited in deep learning tasks such as natural language processing, image analysis and load prediction. The human visual mechanism will pay attention to the critical information of the object deliberately and ignore the irrelevant information. Consequently, it has been found that the relevant time-series information can be effectively preserved by adding an attention mechanism and weight allocation principle in the network model [29]. The structure of attention is shown in Figure 5.

    Figure 5.  Attention mechanism.

    The WOA is a bionic meta-heuristic optimization algorithm proposed by the Australian scholars Mirjalili and Lewis in 2016, which models the predation behavior of humpback whales. The algorithm realizes local search by imitating the whale hunting behavior and realizes global search through a random search strategy. The WOA is fast and precise in model parameter optimization, so it has wide application prospects. Nevertheless, as the amount of power load data and the number of influencing factors increase, the traditional WOA shows some limitations in coordinating global search and local exploitation. In particular, the linearly decreasing convergence factor $a$ of the WOA cannot reflect the optimization process well. Therefore, a nonlinear convergence factor $a$ is proposed:

    $$a=2-2\sin\left(u\frac{t}{max\_iter}\pi+\varphi\right) \tag{2.9}$$

    where $u$ and $\varphi$ are preset parameters, set here to $u=2$ and $\varphi=0$, $t$ is the current iteration and $max\_iter$ is the maximum number of iterations. When $a$ is large at the initial stage of training, the slowly decreasing convergence factor effectively enlarges the search range for the optimal parameters. As the number of iterations increases, the convergence factor decreases faster and the convergence speed accelerates.

    The nonlinear factor $a$ improves the performance of the algorithm. However, the traditional WOA does not make effective use of the whale position vector, which reduces population flexibility and affects the optimization result. In this paper, the adaptive weight $\omega$ is introduced to enhance the global search capability of the algorithm and increase the population diversity of the WOA. The formula for calculating $\omega$ is as follows:

    $$\omega=0.2\cos\left(\frac{\pi}{2}\left(1-\frac{t}{max\_iter}\right)\right) \tag{2.10}$$
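
    To see how the adaptive weight evolves, the sketch below evaluates Eq (2.10) at the start, middle and end of a run. It assumes the reading ω = 0.2 cos((π/2)(1 − t/max_iter)) and uses 500 iterations, the value quoted for the benchmark experiment; the function name `adaptive_weight` is hypothetical.

```python
import math


def adaptive_weight(t, max_iter):
    """Adaptive weight of Eq (2.10), assuming the (1 - t / max_iter) reading."""
    return 0.2 * math.cos(math.pi / 2.0 * (1.0 - t / max_iter))


# Weight at the first, middle and last iteration of a 500-iteration run.
weights = [adaptive_weight(t, 500) for t in (0, 250, 500)]
```

    Under this reading the weight rises monotonically from about 0 at the first iteration to 0.2 at the last one.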

    In this paper, the performance of the IWOA is verified on the benchmark test function $f(x)=\sum_{i=1}^{n}\left[x_i^{2}-10\cos(2\pi x_i)+10\right]$. The number of iterations was set to 500 and the dimension of the benchmark function was set to 30. To ensure the reliability of the optimization results, the average of 10 experimental runs is reported. The WOA, the WOA with the nonlinear convergence factor (NWOA) and the IWOA with both adaptive weights and the nonlinear convergence factor are compared. The experimental results are shown in Figure 6. They show that the IWOA improves not only the convergence speed of the algorithm but also its convergence accuracy. The flow chart of the improved WOA is shown in Figure 7.

    Figure 6.  Convergence curves of the WOA variants on the benchmark test function.
    Figure 7.  Flow chart of the improved WOA.
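
    The benchmark used above is the Rastrigin function, whose global minimum is 0 at the origin. A minimal sketch of it, with the 30-dimensional setting quoted in the text, is given below; the function and variable names are assumptions for illustration.

```python
import math


def rastrigin(x):
    """f(x) = sum_i [x_i^2 - 10 cos(2 pi x_i) + 10]."""
    return sum(xi * xi - 10.0 * math.cos(2.0 * math.pi * xi) + 10.0 for xi in x)


value_at_origin = rastrigin([0.0] * 30)  # global minimum of the function
value_at_ones = rastrigin([1.0] * 30)    # a nearby local minimum region
```

    The many cosine-induced local minima are what make this function a common test of whether an optimizer balances global search and local exploitation.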

    CatBoost is a machine learning library that the Russian search company Yandex open-sourced in 2017, and it is an improvement on the gradient boosting decision tree (GBDT) algorithm [30]. The CatBoost algorithm has fewer parameters than the GBDT algorithm. It effectively solves the problems of gradient deviation and prediction deviation, reduces the risk of model overfitting and improves the generalization ability of the algorithm. CatBoost algorithms are often used in data mining and load forecasting.

    Noise and low-frequency data interference can be effectively reduced by adding a prior distribution term to the gradient boosting decision tree. The calculation is as follows:

    $$\hat x_k^{i}=\frac{\sum_{j=1}^{p-1}\left[x_{\sigma_j,k}=x_{\sigma_i,k}\right]Y_{\sigma_j}+a\,p}{\sum_{j=1}^{p-1}\left[x_{\sigma_j,k}=x_{\sigma_i,k}\right]+a} \tag{2.11}$$

    where $[\cdot]$ is the indicator function, $a$ represents the weight coefficient of the prior term and $p$ represents the prior term. CatBoost usually takes the mean of the data set as the prior term when solving regression problems.
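
    The idea behind Eq (2.11) can be sketched in a few lines: each sample is encoded from the targets of earlier samples with the same category, smoothed by a prior. This is an illustrative re-implementation, not CatBoost's own code; the function name, the toy categories and targets, and the use of list order as the permutation σ are all assumptions.

```python
def ordered_target_statistic(categories, targets, a=1.0, prior=None):
    """Encode a categorical feature using only the samples that precede each one.

    For each sample (in the given order) the encoding is
    (sum of targets of earlier samples with the same category + a * prior)
    / (number of such earlier samples + a).
    """
    if prior is None:
        prior = sum(targets) / len(targets)  # mean target as the prior term
    sums, counts, encoded = {}, {}, []
    for c, y in zip(categories, targets):
        s, n = sums.get(c, 0.0), counts.get(c, 0)
        encoded.append((s + a * prior) / (n + a))
        sums[c], counts[c] = s + y, n + 1    # update only after encoding
    return encoded


cats = ["sunny", "rainy", "sunny", "sunny"]  # toy categorical feature (made up)
ys = [10.0, 20.0, 30.0, 50.0]                # toy targets (made up)
enc = ordered_target_statistic(cats, ys, a=1.0)
```

    The first occurrence of a category falls back entirely on the prior, which is what damps the influence of rare categories.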

    The MAPE-RW algorithm fuses different models according to their prediction errors and outputs the optimal prediction result. The proportion of the predicted value of each model is determined by finding the optimal weight. The final predicted value is calculated as follows:

    $$\begin{cases}\omega_i=\dfrac{M_j}{M_i+M_j}\\ f_{final}=\omega_{VTWBA}f_{VTWBA}+\omega_{Catboost}f_{Catboost}\end{cases} \tag{2.12}$$

    where $\omega_i$ is the weight of model $i$, $M_i$ and $M_j$ are the MAPE values of the two models, and $f_{final}$ is the final prediction output of the multi-model fusion. $f_{Catboost}$ and $f_{VTWBA}$ are the predicted outputs of the CatBoost model and the VMD-decomposed WTCN-IWOA-BiGRU-attention model, respectively.
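
    A small sketch of the reciprocal weighting follows. It assumes that M denotes each model's MAPE and that each weight multiplies its own model's prediction, so the model with the smaller error receives the larger weight. The function names and the two prediction vectors are made up; the MAPE values 0.914 and 0.727 are those reported in Table 6 for Model one and CatBoost.

```python
def mape_rw_weights(mape_a, mape_b):
    """Weight of each model is the other model's share of the total MAPE."""
    total = mape_a + mape_b
    return mape_b / total, mape_a / total


def fuse(pred_a, pred_b, mape_a, mape_b):
    """Weighted combination of two prediction series."""
    w_a, w_b = mape_rw_weights(mape_a, mape_b)
    return [w_a * pa + w_b * pb for pa, pb in zip(pred_a, pred_b)]


# MAPE values reported in Table 6 for Model one and CatBoost.
w_vtwba, w_catboost = mape_rw_weights(0.914, 0.727)
# Made-up predictions, only to show the fusion step.
fused = fuse([8000.0, 8200.0], [8100.0, 8150.0], 0.914, 0.727)
```

    With these inputs the weights come out at roughly 0.443 for Model one and 0.557 for CatBoost, and each fused value lies between the two individual predictions.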

    Time-series data contain many factors that influence the power load, so traditional prediction models cannot extract the data feature patterns effectively. In this paper, a multi-model fusion short-term power load forecasting model is proposed by combining a deep learning algorithm and a machine learning algorithm. By combining the advantages of different algorithms, the characteristic information in the data can be effectively mined to improve the accuracy of load prediction. The prediction model design of the multi-model fusion network is shown in Figure 8.

    Figure 8.  Multi-model fusion network load forecasting model.

    1) Data processing. The model validation analysis is carried out on a public power load data set from 2006 to 2010 for an Australian region. This data set contains six-dimensional feature vectors, and the feature parameters are shown in Table 1 below. The data are sampled every 30 min, and load prediction is carried out by using a sliding window with a window size of 10 and a step size of 1. Therefore, 10 sets of historical data are employed to predict the electric load value at the next moment. The data set is divided into a training set, a validation set and a test set in the ratio 3:1:1. The multi-model fusion network employs the validation set for parameter tuning during training. In order to make the evaluation of the network model more accurate, the test set does not participate in the training.

    Table 1.  Characteristic parameters.
    | Characteristic parameters | Parameter types | Character descriptions |
    |---|---|---|
    | Date | Time | Samples were taken every 30 minutes |
    | Weather factors | Dew-point temperature | Equilibrium temperature |
    | | Dry-bulb temperature | Aerothermodynamic temperature |
    | | Wet-bulb temperature | Thermodynamic saturation temperature |
    | | Humidity | Degree of atmospheric dryness |
    | Economic factors | Electricity price | Price per kWh |


    2) Input layer. Characteristic data and power load data are used as the input of the prediction model. The input data of length n are fed into the prediction model after missing-value filling and normalization.

    3) VMD layer. The power load data are employed as the input of the VMD layer. The long time-series data are input into prediction Model one after missing-value filling and normalization. In the VMD, the values of k and alpha are determined from the central frequencies of the decomposition, which are calculated for different values of k and alpha. Choosing a reasonable value of k avoids mode mixing and also yields fewer network parameters for the IWOA-based Model one. As can be seen from Table 2, at k = 5 and alpha = 1850 the central frequencies are relatively stable with the smallest number of decomposition layers, which reduces the number of parameters in further training and improves the model training speed. Therefore, the penalty factor of the variational mode decomposition is set to alpha = 1850, the convergence tolerance to tol = 1e-7 and the number of decomposition modes to k = 5. The decomposition of each mode is shown in Figure 9.

    Table 2.  Center frequencies corresponding to different values of k.
    | k | Simulated signal center frequency |
    |---|---|
    | 2 | 0.0001, 0.02243 |
    | 3 | 0.0001, 0.02080, 0.04274 |
    | 4 | 0.0001, 0.02076, 0.04167, 0.06471 |
    | 5 | 0.0001, 0.02076, 0.04162, 0.06469, 0.34702 |
    | 6 | 0.0001, 0.02076, 0.04162, 0.06238, 0.08467, 0.35872 |

    Figure 9.  Results of VMD.

    4) IWOA-based hyperparameter optimization. The optimal hyperparameters are obtained by employing the IWOA. So that the optimization search is performed within a valid range, the range of each network parameter is defined as shown in Table 3. Each component generated by the VMD of the load data, together with the weather characteristics, is used as the input of the WTCN-BiGRU-attention network, and the IWOA is employed to optimize the network hyperparameters. The optimal hyperparameter search results for each component are shown in Table 4.

    Table 3.  Parameters of IWOA.
    | Category | Parameters | Parameter values |
    |---|---|---|
    | Parameter settings | Population size | 5 |
    | | Max iterations | 5 |
    | | Constant b | 2 |
    | Search for upper and lower bounds | Learn rate | [0.001, 0.01] |
    | | Epoch | [10, 100] |
    | | Batchsize | [16, 128] |
    | | BiGRU node number | [1, 20] |
    | | Number of nodes at the full | [1, 100] |

    Table 4.  Optimal parameter selection for power load forecasting.
    | Modal tags | IMF1 | IMF2 | IMF3 | IMF4 | IMF5 |
    |---|---|---|---|---|---|
    | Learn rate | 0.0047 | 0.0027 | 0.0064 | 0.0054 | 0.0024 |
    | Epoch | 38 | 26 | 61 | 70 | 89 |
    | Batchsize | 72 | 19 | 72 | 37 | 22 |
    | BiGRU node number | 8 | 18 | 8 | 17 | 2 |
    | BiGRU node number | 16 | 11 | 7 | 17 | 19 |
    | Number of nodes at the full | 49 | 4 | 46 | 14 | 97 |


    5) WTCN layer. The influencing factors of the load characteristics are added to each of the modes decomposed by the VMD layer. The Morlet wavelet function is used as the residual block activation function. The network extracts load characteristics and influencing factors through the WTCN layer, in which the weights of the convolutional kernels are normalized. The dropout coefficient is set to 0.2 to prevent over-fitting of the model, the expansion coefficients are set to (1, 2, 4, 8, 16, 32) and the number of filters is set to 128.

    6) BiGRU layer. The model builds two BiGRU layers to learn from the features extracted by the WTCN, make full use of the data features and capture their internal change rules.

    7) Attention layer. The input of the attention mechanism is the output of the two-layer BiGRU network after activation. The corresponding proportions of the different feature vectors are calculated according to the weight allocation principle, and the optimal weight parameter matrix is searched for through continuous updating and iteration.

    8) CatBoost prediction model. A random search algorithm is employed to select the CatBoost network hyperparameters. The optimal network hyperparameters are shown in Table 5. The input power load and weather characteristic factors are modeled.

    Table 5.  CatBoost network hyperparameters for power load forecasting.
    | Verbose | Learning rate | Iterations | Depth |
    |---|---|---|---|
    | 50 | 0.03 | 900 | 12 |


    9) Output layer. The IWOA-WTCN-BiGRU-attention network is set as Model one, and the CatBoost network is set as Model two. The MAPE-RW algorithm was exploited to calculate the weight of the output results of Model one and Model two. Finally, the load prediction output of the multi-model fusion network is obtained by effective fusion of the model prediction results.
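
    The sample construction described in step 1) (window size 10, step 1, 3:1:1 split) can be sketched as follows. The function names and the synthetic 60-point series are assumptions, and the split is taken in chronological order, which the text does not state explicitly.

```python
def make_windows(series, window=10, step=1):
    """Return (inputs, targets): each input holds `window` past values
    and the target is the value at the next time step."""
    inputs, targets = [], []
    for start in range(0, len(series) - window, step):
        inputs.append(series[start:start + window])
        targets.append(series[start + window])
    return inputs, targets


def split_3_1_1(samples):
    """Chronological 3:1:1 split into training, validation and test parts."""
    n = len(samples)
    n_train, n_val = n * 3 // 5, n // 5
    return (samples[:n_train],
            samples[n_train:n_train + n_val],
            samples[n_train + n_val:])


load = [float(v) for v in range(60)]  # made-up series standing in for the load data
X, y = make_windows(load)
train, val, test = split_3_1_1(list(zip(X, y)))
```

    Keeping the test part at the end of the series means that no test sample precedes a training sample in time, which matches the statement that the test set does not take part in training.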

    The Adam optimization algorithm was selected as the parameter optimization method of Model one. Adam is a first-order optimization algorithm that can effectively replace the traditional gradient descent process. It iteratively updates the network weights according to the data so that the loss function is minimized. The loss function of the model is the mean square error:

    $$MSE=\frac{1}{N}\sum_{i=1}^{N}(y_i-\hat y_i)^2 \tag{3.1}$$

    where $N$ is the number of samples, and $y_i$ and $\hat y_i$ are the actual and predicted load values of sample $i$, respectively.

    The min-max normalization method is used to normalize the original data and increase the training speed of the model. Inverse normalization of the predicted data makes the comparison between the predicted value and the real value more intuitive. The normalization formula is

    $$x_n=\frac{x-x_{min}}{x_{max}-x_{min}} \tag{4.1}$$

    where $x$ is the original load data, $x_{max}$ and $x_{min}$ are the maximum and minimum values of the sample data, respectively, and $x_n$ is the normalized data.
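
    A minimal sketch of Eq (4.1) and its inverse is shown below. The function names and the three load values are made up for illustration; in practice the minimum and maximum would typically be taken from the training data.

```python
def min_max_fit(values):
    """Return the minimum and maximum used for scaling."""
    return min(values), max(values)


def normalize(values, x_min, x_max):
    """x_n = (x - x_min) / (x_max - x_min)."""
    return [(x - x_min) / (x_max - x_min) for x in values]


def denormalize(values, x_min, x_max):
    """Inverse of normalize: map values in [0, 1] back to the original scale."""
    return [v * (x_max - x_min) + x_min for v in values]


raw = [6000.0, 7500.0, 9000.0]  # made-up load values
lo, hi = min_max_fit(raw)
scaled = normalize(raw, lo, hi)
restored = denormalize(scaled, lo, hi)
```

    Applying `denormalize` to the network output is the inverse normalization step that puts predictions back on the scale of the measured load.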

    The RMSE, MAPE, mean absolute error (MAE) and R-square were utilized as evaluation indexes. The calculation formulas are as follows:

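    $$\begin{cases}RMSE=\sqrt{\dfrac{1}{N}\sum\limits_{i=1}^{N}(\tilde x_i-x_i)^2}\\ MAPE=\dfrac{100}{N}\sum\limits_{i=1}^{N}\left|\dfrac{\tilde x_i-x_i}{\tilde x_i}\right|\\ MAE=\dfrac{1}{N}\sum\limits_{i=1}^{N}\left|\tilde x_i-x_i\right|\\ R\text{-}square=1-\dfrac{\sum_{i=1}^{N}(\tilde x_i-x_i)^2}{\sum_{i=1}^{N}(\tilde x_i-\bar x)^2}\end{cases} \tag{4.2}$$

    where $N$ is the number of samples, $\tilde x_i$ is the true value of the $i$th sample point, $x_i$ is the predicted value of the $i$th sample point and $\bar x$ is the mean of the true values.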
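
    The four indexes can be computed together as in the sketch below. It uses the standard definitions (square root in the RMSE, division by the true value in the MAPE, mean of the true values in R-square), which is an assumption about how Eq (4.2) should be read; the function name and the three sample points are made up.

```python
import math


def evaluate(true, pred):
    """Return RMSE, MAPE (in percent), MAE and R-square for two equal-length lists."""
    n = len(true)
    errors = [t - p for t, p in zip(true, pred)]
    mean_true = sum(true) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mape = 100.0 / n * sum(abs(e / t) for e, t in zip(errors, true))
    mae = sum(abs(e) for e in errors) / n
    r2 = 1.0 - sum(e * e for e in errors) / sum((t - mean_true) ** 2 for t in true)
    return {"RMSE": rmse, "MAPE": mape, "MAE": mae, "R2": r2}


# Made-up true and predicted values, only to exercise the formulas.
scores = evaluate([100.0, 200.0, 300.0], [110.0, 190.0, 300.0])
```

    For this toy input the MAPE is 5% and R-square is 0.99; lower RMSE, MAPE and MAE and a higher R-square indicate a better fit.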

    The prediction results of the proposed model were compared with those of traditional single models and hybrid deep learning models, namely the GRU, LSTM, TCN, WTCN, WTCN-GRU, WTCN-LSTM, TCN-BiGRU, WTCN-BiGRU and TCN-BiGRU-attention models. The load forecasts for December 20, 2010, are plotted to show the accuracy advantage of the proposed power load forecasting model more visually. The load forecasting curves are shown in Figure 10, where different curves represent the prediction results and trends of the different prediction models. As can be seen from Figure 10, the prediction results of the proposed model are more accurate, more stable and closer to the real load. Table 6 shows the overall evaluation indexes of each model on the test set.

    Figure 10.  Results of the load forecast data for December 20, 2010.
    Table 6.  Total evaluation index results of the electricity load forecasting models.
    | Prediction model | RMSE | MAPE (%) | MAE | R-square |
    |---|---|---|---|---|
    | IWOA-WTCN-BiGRU-attention | 110.964 | 0.914 | 82.189 | 0.993 |
    | CatBoost | 89.089 | 0.727 | 63.407 | 0.996 |
    | Fusion model of this paper | 77.495 | 0.632 | 56.103 | 0.997 |
    | WTCN-BiGRU-attention | 103.452 | 0.86 | 76.179 | 0.994 |
    | WTCN-BiGRU | 97.573 | 0.803 | 69.903 | 0.995 |
    | WTCN-GRU | 96.584 | 0.814 | 70.868 | 0.995 |
    | WTCN-LSTM | 100.309 | 0.843 | 73.649 | 0.995 |
    | WTCN | 103.558 | 0.869 | 75.975 | 0.994 |
    | TCN | 103.969 | 0.89 | 78.024 | 0.994 |
    | LSTM | 147.665 | 1.261 | 108.342 | 0.988 |
    | GRU | 109.991 | 0.932 | 82.114 | 0.994 |
    | CNN | 337.913 | 3.241 | 275.52 | 0.939 |


    The forecasting performance of the different models was also evaluated month by month. Table 7 shows the error values of the monthly forecasting results of the different models. For MAPE, MAE and RMSE, smaller error values indicate better forecasting performance; for R-square, a larger value means that the predicted values are closer to the real values. From the data in Table 6 and Figure 10, the following conclusions can be drawn:

    Table 7.  Monthly RMSE of each prediction model.
    Month Model one Model two Fusion model WT-BG-a WT-BG WT-G WT-L WT
    Jan. 108.372 106.039 97.048 126.77 116.382 96.321 107.057 121.456
    Feb. 102.257 84.791 76.02 108.625 89.625 90.072 88.652 103.947
    Mar. 85.485 72.585 61.48 84.297 76.175 80.666 85.047 86.109
    Apr. 96.362 106.033 83.271 105.568 110.002 108.207 109.236 113.008
    May 121.924 95.097 80 101.731 101.456 101.066 100.091 108.432
    Jun. 141.496 86.97 82.599 100.258 91.622 93.661 98.118 104.279
    Jul. 131.409 86.787 77.983 101.809 87.982 92.821 95.962 100.363
    Aug. 170.315 84.826 102.867 105.915 103.802 105.615 112.917 107.68
    Sept. 127.524 83.933 81.581 104.565 99.724 99.005 101.172 100.071
    Oct. 85.224 89.977 65.933 95.965 93.855 99.785 101.756 97.116
    Nov. 79.742 84.105 62.097 99.21 95.347 101.174 106.564 100.065
    Dec. 79.927 82.657 63.727 103.142 101.035 92.024 101.432 101.644


    1) Compared with VMD-IWOA-WTCN-BiGRU-attention and CatBoost used alone, the proposed multi-model fusion gives more accurate predictions: the RMSE decreased by 33.469 and 11.594, the MAPE by 0.282% and 0.095%, and the MAE by 26.086 and 7.304, respectively. The reason is that VMD-IWOA-WTCN-BiGRU-attention has a large prediction deviation during hours of load fluctuation because VMD loses part of the data, whereas the CatBoost model predicts stationary components more accurately but deviates more when the data fluctuate strongly. The MAPE-RW algorithm is therefore used to combine the advantages of the two models and obtain a more accurate prediction.

    2) Compared with the other independent prediction models, the predictions of the proposed model are closer to the real values. Compared with the WTCN-BiGRU prediction model, the RMSE decreased by 20.078, the MAPE by 0.171% and the MAE by 13.8. The underlying combination model therefore also trains well, but its prediction accuracy is lower.
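
    The reductions quoted in the two conclusions above can be recomputed directly from the rows of Table 6. The short Python sketch below does this for the RMSE, MAPE and MAE columns; the dictionary layout and the function name are illustrative choices, while the numbers are copied from Table 6.

```python
# (RMSE, MAPE, MAE) copied from Table 6.
table6 = {
    "IWOA-WTCN-BiGRU-attention": (110.964, 0.914, 82.189),
    "CatBoost": (89.089, 0.727, 63.407),
    "Fusion model": (77.495, 0.632, 56.103),
    "WTCN-BiGRU": (97.573, 0.803, 69.903),
}


def reduction(baseline, fused="Fusion model"):
    """Return how much RMSE, MAPE and MAE drop from a baseline to the fusion model."""
    return tuple(round(b - f, 3) for b, f in zip(table6[baseline], table6[fused]))


for name in table6:
    if name != "Fusion model":
        print(name, reduction(name))
```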

    To verify the feasibility and accuracy of the model in a different forecasting domain, we used the measured 2018 power generation of the Ningxia Wuzhong Sun Mountain photovoltaic (PV) power plant in China for PV power generation prediction, together with five types of environmental data (total solar irradiation, PV panel module temperature, ambient temperature, atmospheric pressure and relative humidity) measured by the environmental detector corresponding to this PV array. The data were collected at 15-minute intervals. Since the PV array only generates power during the daytime, the daily PV output power between 7:30 and 16:30 was selected as the valid model data. The data are divided in a ratio of 10:1:1: the first 10 months are used as the training data, November as the validation data during training and December as the test set. The prediction model parameters are set in the same way as those of the power load prediction model. For the PV power generation prediction model, the WTCN-BiGRU-attention network hyperparameters were selected by the WOA algorithm, as shown in Table 8, and the best CatBoost network hyperparameters were selected by the random search algorithm, as shown in Table 9.
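
    The daytime filtering and the 10:1:1 monthly split described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' preprocessing code: the function names are hypothetical, both window endpoints (7:30 and 16:30) are assumed to be included, and the timestamps are synthetic.

```python
from datetime import datetime, time, timedelta

DAY_START, DAY_END = time(7, 30), time(16, 30)


def in_daytime_window(ts):
    """True if the timestamp lies in the 7:30-16:30 window (endpoints assumed inclusive)."""
    return DAY_START <= ts.time() <= DAY_END


def split_name(ts):
    """Months 1-10 go to training, November to validation, December to test."""
    if ts.month <= 10:
        return "train"
    return "validation" if ts.month == 11 else "test"


def partition(timestamps):
    """Drop samples outside the daytime window and group the rest by split."""
    parts = {"train": [], "validation": [], "test": []}
    for ts in timestamps:
        if in_daytime_window(ts):
            parts[split_name(ts)].append(ts)
    return parts


def day_grid(year, month, day):
    """All 96 timestamps of one day at 15-minute resolution."""
    start = datetime(year, month, day)
    return [start + timedelta(minutes=15 * k) for k in range(96)]


# Synthetic example: one day from each split of 2018.
stamps = day_grid(2018, 3, 1) + day_grid(2018, 11, 1) + day_grid(2018, 12, 1)
parts = partition(stamps)
print({name: len(items) for name, items in parts.items()})
```

    Splitting by calendar month keeps the test period strictly after the training period, which avoids leaking future samples into training.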

    Table 8.  Selection of hyperparameters for the PV power prediction network.
    Modal tags IMF1 IMF2 IMF3 IMF4 IMF5
    Learning rate 0.0038 0.0012 0.0059 0.0067 0.0030
    Epoch 72 54 50 73 96
    Batch size 32 57 118 117 70
    BiGRU node number 5 1 16 10 3
    BiGRU node number 6 7 15 13 12
    Number of nodes at the full 44 12 54 84 12

    Table 9.  PV power forecasting CatBoost network hyperparameters.
    Verbose Learning rate Iterations Depth
    50 0.05 500 9


    The prediction results of the model are compared with those of the hybrid neural network models WTCN-BiGRU-attention, TCN-BiGRU-attention and CNN-BiGRU-attention. The daily PV power generation forecasts for two days are plotted to show the advantages of the proposed multi-model fusion forecasting network more intuitively. The prediction curves are shown in Figures 11 and 12, where different curves represent the prediction results and trends of different models. The figures show that the proposed multi-model fusion forecasting network has higher accuracy. Table 10 lists the overall test-set evaluation indexes of each model.

    Figure 11.  December 10, 2018 PV power forecast.
    Figure 12.  December 18, 2018 PV power forecast.
    Table 10.  PV power generation forecast model evaluation index.
    Prediction model RMSE MAPE MAE R-square
    Model one 1.996 24.380 1.190 0.964
    Model two 1.434 13.119 1.023 0.982
    Fusion model of this paper 1.285 12.285 0.924 0.985
    WTCN-BiGRU-attention 3.737 26.8 2.315 0.875
    TCN-BiGRU-attention 3.739 23.808 2.466 0.875
    CNN-BiGRU-attention 3.738 26.243 2.365 0.875


    To sum up, the multi-model fusion approach proposed in this paper for short-term power load prediction delivers better prediction performance and more stable prediction results, meaning that the model is better suited to predicting power load data with multidimensional feature inputs.

    The major objective of this study was to build a forecasting model that integrates multiple models to improve the accuracy of power load forecasting. In Model one, the data are decomposed into multiple components by VMD, and an IWOA is then used to optimize the hyperparameters of the WTCN-BiGRU-attention network model. In parallel, Model two predicts the multi-dimensional load data with the CatBoost algorithm. Finally, the MAPE-RW algorithm is employed to fuse the prediction results of the two models to achieve accurate predictions of short-term power load data. Taking the multidimensional load data of an area in Australia as an example, a feasibility verification analysis was carried out, and the main conclusions are as follows:

    1) Based on the power load data of a region in Australia, a multi-dimensional power load feature set was constructed so that Model one can better predict the strongly fluctuating, non-stationary components of the power load data.

    2) The stationary components of the multi-dimensional power load data are predicted by Model two, and the prediction results of the two models are fused by the MAPE-RW algorithm, which improves the power load prediction accuracy of the multi-model fusion.
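
    This section does not spell out the MAPE-RW fusion step. The Python sketch below is therefore only an assumption about how such a fusion might look: it takes "RW" to mean reciprocal weighting, so that each model's weight is proportional to the reciprocal of its MAPE and the weights sum to one. The function names and the sample predictions are hypothetical; the two MAPE values are those of Model one and Model two in Table 6, and which data set the MAPE should be measured on is left open.

```python
def reciprocal_mape_weights(mapes):
    """Weights proportional to 1/MAPE, normalized to sum to one (assumed scheme)."""
    if any(m <= 0 for m in mapes):
        raise ValueError("MAPE values must be positive")
    inverse = [1.0 / m for m in mapes]
    total = sum(inverse)
    return [v / total for v in inverse]


def fuse(predictions, weights):
    """Weighted sum of per-model predictions, computed time step by time step.

    predictions: one list of predicted values per model, all of equal length.
    """
    return [sum(w * p for w, p in zip(weights, step)) for step in zip(*predictions)]


# MAPE of Model one and Model two from Table 6; predictions are made up.
weights = reciprocal_mape_weights([0.914, 0.727])
fused = fuse([[7000.0, 7200.0], [7100.0, 7150.0]], weights)
print(weights)
print(fused)
```

    Under this scheme the model with the lower MAPE receives the larger weight, and each fused value lies between the two individual predictions.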

    To sum up, a hybrid neural network combined model for predicting power load data with multidimensional characteristics has been proposed in this paper. This research not only provides a reference and an additional option for short-term power load forecasting methods, but is also of reference value for other power fields, such as wind power generation forecasting and energy storage unit service-life forecasting. However, the structure of the multi-model fusion network is overly complex: while it improves the accuracy of power load prediction, it also increases the prediction time and consumes more computing resources. In the future, the authors will therefore work on designing a more concise interval prediction model for power load with suitable accuracy. Besides reducing the training time, interval prediction makes the prediction results more meaningful for the power sector when conducting power dispatching and helps avoid wasting power resources.

    This work was supported by the State Grid Corporation of China Headquarters Science and Technology Project (5400-202122573A-0-5-SF). The authors thank the editors and the anonymous reviewers for their helpful comments and suggestions that have improved the presentation of this manuscript.

    The authors declare that there is no conflict of interest.



    [1] M. Bacák, Computing medians and means in Hadamard spaces, SIAM J. Optimiz., 24 (2014), 1542–1566. https://doi.org/10.1137/140953393 doi: 10.1137/140953393
    [2] M. Bacák, Old and new challenges in Hadamard spaces, arXiv preprint arXiv: 1807.01355, (2018), 1–33.
    [3] I. D. Berg, I. G. Nikolaev, Quasilinearization and curvature of Aleksandrov spaces, Geometriae Dedicata, 133 (2008), 195–218. https://doi.org/10.1007/s10711-008-9243-3 doi: 10.1007/s10711-008-9243-3
    [4] M. R. Bridson, A. Haefliger, Metric spaces of non-positive curvature, Springer Science & Business Media, 2013. https://doi.org/10.1007/978-3-662-12494-9
    [5] K. S. Brown, Buildings, Springer, 76–98, 1989. https://doi.org/10.1007/978-1-4612-1019-1_4
    [6] F. Bruhat, J. Tits, Groupes réductifs sur un corps local, Publications Mathématiques de l'Institut des Hautes Études Scientifiques, 41 (1972), 5–251. https://doi.org/10.1007/978-3-642-87942-5_3
    [7] P. Chaipunya, F. Kohsaka, P. Kumam, Monotone vector fields and generation of nonexpansive semigroups in complete CAT(0) spaces, Numer. Func. Anal. Opt., (2021), 1–30. https://doi.org/10.1080/01630563.2021.1931879
    [8] P. Chaipunya, P. kumam, On the proximal point method in Hadamard spaces, Optimization, 66 (2017), 1647–1665. https://doi.org/10.1080/02331934.2017.1349124 doi: 10.1080/02331934.2017.1349124
    [9] H. Dehghan, C. Izuchukwu, O. Mewomo, D. Taba, G. Ugwunnadi, Iterative algorithm for a family of monotone inclusion problems in CAT(0) spaces, Quaest. Math., 43 (2020), 975–998. https://doi.org/10.2989/16073606.2019.1593255 doi: 10.2989/16073606.2019.1593255
    [10] S. Dhompongsa, A Kaewkhao, B. Panyanak, On Kirk's strong convergence theorem for multivalued nonexpansive mappings on CAT(0) spaces, Nonlinear Anal. Theor., 75 (2012), 459–468. https://doi.org/10.1016/j.na.2011.08.046 doi: 10.1016/j.na.2011.08.046
    [11] S. Dhompongsa, W. Kirk, B. Panyanak, Nonexpansive set-valued mappings in metric and Banach spaces, J. Nonlinear Convex A., 8 (2007), 35. https://doi.org/10.1016/j.na.2011.08.046 doi: 10.1016/j.na.2011.08.046
    [12] S. Dhompongsa, W. A. Kirk, B. Sims, Fixed points of uniformly Lipschitzian mappings, Nonlinear Anal. Theor., 65 (2006), 762–772. https://doi.org/10.1016/j.na.2005.09.044 doi: 10.1016/j.na.2005.09.044
    [13] S. Dhompongsa, B. Panyanak On -convergence theorems in CAT(0) spaces, Comput. Math. Appl., 56 (2008), 2572–2579. https://doi.org/10.1016/j.camwa.2008.05.036 doi: 10.1016/j.camwa.2008.05.036
    [14] A. Feragen, S. Hauberg, M. Nielsen, F. Lauze, Means in spaces of tree-like shapes, International Conference on Computer Vision, (2011), 736–746. https://doi.org/10.1109/iccv.2011.6126311
    [15] K. Goebel, R. Simeon Uniform convexity, hyperbolic geometry, and nonexpansive mappings, Dekker, 1984. https://doi.org/10.1112/blms/17.3.293
    [16] O. Güler, On the convergence of the proximal point algorithm for convex minimization, SIAM J. Control Optim., 29 (1991), 403–419. https://doi.org/10.1137/0329022 doi: 10.1137/0329022
    [17] B. A. Kakavandi, M. Amini, Duality and subdifferential for convex functions on complete CAT(0) metric spaces, Nonlinear Anal. Theory., 73 (2010), 3450–3455. https://doi.org/10.1016/j.na.2010.07.033 doi: 10.1016/j.na.2010.07.033
    [18] S. Kamimura, W. Takahashi, Approximating solutions of maximal monotone operators in Hilbert spaces, J. Approx. Theory, 106 (2000), 226–240. https://doi.org/10.1006/jath.2000.3493 doi: 10.1006/jath.2000.3493
    [19] K. Khammahawong, P. Kumam, P. Chaipunya, J. Martínez-Moreno, Tseng's methods for inclusion problems on Hadamard manifolds, Optimization, (2021), 1–35. https://doi.org/10.1080/02331934.2021.1940179
    [20] K. Khammahawong, P. Kumam, P. Chaipunya, J. Yao, C. Wen, W. Jirakitpuwapat, An extragradient algorithm for strongly pseudomonotone equilibrium problems on Hadamard manifolds, Thai J. Math., 18 (2020), 350–371.
    [21] H. Khatibzadeh, S. Ranjbar, Monotone operators and the proximal point algorithm in complete CAT(0) metric spaces, J. Aus. Math. Soc., 103 (2017), 70–90. https://doi.org/10.1017/s1446788716000446 doi: 10.1017/s1446788716000446
    [22] W. A. Kirk, Fixed point theorems in spaces and-trees, Fixed Point Theory A., 4 (2004), 1–8. https://doi.org/10.1155/s1687182004406081 doi: 10.1155/s1687182004406081
    [23] W. A. Kirk, B. Panyanak, A concept of convergence in geodesic spaces, Nonlinear Anal. Theory A., 68 (2008), 3689–3696. https://doi.org/10.1016/j.na.2007.04.011 doi: 10.1016/j.na.2007.04.011
    [24] W. A. Kirk, N. Shahzad, Fixed point theory in distance spaces, Springer, 2014. https://doi.org/10.1007/978-3-319-10927-5
    [25] B. Martinet, Brève communication, Régularisation d'inéquations variationnelles par approximations successives, Revue française d'informatique et de recherche opérationnelle. Série rouge, 4 (1970), 154–158. https://doi.org/10.1051/m2an/197004r301541
    [26] I. Nikolaev, The tangent cone of an Aleksandrov space of curvature k, Manuscripta Math., 86 (1995), 137–147. https://doi.org/10.1007/bf02567983 doi: 10.1007/bf02567983
    [27] F. Ogbuisi, O. Mewomo, Iterative solution of split variational inclusion problem in a real Banach spaces, Afr. Mat., 28 (2017), 295–309. https://doi.org/10.1007/s13370-016-0450-z doi: 10.1007/s13370-016-0450-z
    [28] S. Ranjbar, H. Khatibzadeh, Strong and Δ-Convergence to a Zero of a Monotone Operator in CAT(0) Spaces, Mediterr. J. Math., 14 (2017), 1–15. https://doi.org/10.1007/s00009-017-0885-y doi: 10.1007/s00009-017-0885-y
    [29] S. Reich, I. Shafrir, Nonexpansive iterations in hyperbolic spaces, Nonlinear Anal. Theory, 15 (1990), 537–558. https://doi.org/10.1016/0362-546x(90)90058-o doi: 10.1016/0362-546x(90)90058-o
    [30] R. T. Rockafellar, Monotone operators and the proximal point algorithm, SIAM J. Control Optim., 14 (1976), 877–898. https://doi.org/10.1137/0314056 doi: 10.1137/0314056
    [31] Y. Shehu, X. Qin, J. Yao, Weak and linear convergence of proximal point algorithm with reflections, J. Nonlinear Convex Anal., 22 (2021), 299–307. https://doi.org/10.23952/jnva.5.2021.6.03 doi: 10.23952/jnva.5.2021.6.03
    [32] M. Sun, J. Liu, Y. Wang, Two improved conjugate gradient methods with application in compressive sensing and motion control, Math. Probl. Eng., 2020 (2020). https: //doi.org/10.1155/2020/9175496
    [33] W. Takahashi, K. Shimoji, Convergence theorems for nonexpansive mappings and feasibility problems, Math. Comput. Model., 32 (2000), 1463–1471. https://doi.org/10.1016/s0895-7177(00)00218-1 doi: 10.1016/s0895-7177(00)00218-1
    [34] G. Ugwunnadi, C. Izuchukwu, O. Mewomo, Strong convergence theorem for monotone inclusion problem in CAT(0) spaces, Afr. Mat., 30 (2019), 151–169. https://doi.org/10.1007/s13370-018-0633-x doi: 10.1007/s13370-018-0633-x
  • © 2022 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)