
An accurate ultra-short-term time series prediction of power load is an important guarantee for power dispatching and the safe operation of power systems. Current ultra-short-term time series prediction algorithms suffer from low prediction accuracy, difficulty capturing local mutation features, and poor stability, among other problems. From the perspective of series decomposition, a multi-scale sequence decomposition model (TFDNet) based on power spectral density and the Morlet wavelet transform is proposed that combines a multidimensional correlation feature fusion strategy in the time and frequency domains. By introducing the time-frequency energy selection module, the "prior knowledge" guidance module, and the sequence denoising decomposition module, the model not only effectively delineates the global trend and local seasonal features and completes in-depth information mining of the smooth trend and fluctuating seasonal features, but, more importantly, accurately captures local mutation seasonal features. Finally, on the premise of improving forecasting accuracy, both single-point load forecasting and quantile probabilistic load forecasting are realized for ultra-short-term load forecasting. In experiments on three public datasets and one private dataset, the TFDNet model reduces the mean square error (MSE) and mean absolute error (MAE) by 19.80% and 11.20% on average, respectively, compared with the benchmark methods. These results indicate the potential applications of the TFDNet model.
Citation: Lihe Liang, Jinying Cui, Juanjuan Zhao, Yan Qiang, Qianqian Yang. Ultra-short-term forecasting model of power load based on fusion of power spectral density and Morlet wavelet[J]. Mathematical Biosciences and Engineering, 2024, 21(2): 3391-3421. doi: 10.3934/mbe.2024150
Power load forecasting is a vital component of power system operation and management, holding significant importance for the stable operation and intelligent scheduling of the power system [1]. However, with the advancement of the energy internet and the continual improvement of people's living standards, there has been a substantial increase in the volume and volatility of power load data. This surge poses significant challenges to the management of power system operations. In response, ultra-short-term time series forecasting has become increasingly crucial. Ultra-short-term time series forecasting of power load refers to high-temporal-resolution predictions of the power load over the next few hours to a day [2]. Such forecasting offers more accurate and rapid predictions, providing decision support for energy management, demand response, stable power system operation, and intelligent scheduling. Research in ultra-short-term time series forecasting for power load is therefore of paramount importance for enhancing the efficiency and stability of the power system [3].
Currently, research methods for ultra-short-term time series forecasting can be broadly categorized into two main groups: statistical methods and data-driven artificial neural network methods. The autoregressive integrated moving average (ARIMA) model [4] is one of the most widely applied statistical methods. ARIMA cleverly combines autoregression, moving averages, and differencing operations with other statistical techniques to predict stationary time series. However, with the sharp increase in historical data, data-driven artificial neural network methods have shown predictive performance superior to statistical methods. Convolutional neural networks (CNN) [5] are effective at extracting local features from time series data, though they are less capable of capturing temporal relationships between sequences; they are therefore better suited to time series with strong periodicity. Recurrent neural networks (RNN) [6], on the other hand, are efficient at handling sequential data. This model leverages memory cells to facilitate continuous information transfer within the network, thereby capturing the temporal dependencies between sequential data. However, the vanishing gradient problem, a common challenge for most RNN models, limits the application of RNNs in sequence forecasting. DeepAR [7] is an RNN model that combines autoregressive methods with long short-term memory (LSTM) [8]. The core of DeepAR is to use LSTM to learn the dynamic features of historical sequences and thereby predict future time points. The residential load forecasting multimodal graph neural network (RLF-MGNN) [9] combines graph convolutional neural networks (GCN) and LSTM: the GCN extracts the linear and nonlinear features of the synchronization and causal graphs, and the LSTM then performs ultra-short-term prediction of the sequence. Such variants can effectively alleviate problems such as vanishing gradients in the RNN, but they cannot remove the recursive dependency, which degrades model performance.
Recently, the Transformer model has demonstrated remarkable performance in sequence data analysis [10]. By utilizing attention mechanisms, the Transformer can capture long-range dependencies among sequence elements and enable one-step computation, thus addressing the recursive dependency challenge faced by previous RNN models. Consequently, numerous variants of the Transformer model have emerged. For instance, LogTrans [11] is a Transformer model based on logarithmic transformations, which uses exponentially growing time intervals for attention calculation. Reformer [12] introduces a position-sensitive, hash-based attention mechanism as an alternative to traditional attention mechanisms. Informer [13] employs a sparse attention mechanism based on KL (Kullback-Leibler) divergence, a measure of the difference between two probability distributions, along with a distillation strategy, while producing one-step output predictions to avoid error accumulation in the model results. However, the attention mechanisms in these models still operate at the point level and do not fully exploit the correlations between similar sequences. Additionally, there is room for improvement in the interpretability of these models.
The sequence decomposition model is an encoder-decoder model with a Transformer at its core that incorporates traditional time series decomposition. The idea is to decompose a time series into components with evident regularity, such as trend and seasonality, learn these components separately, and progressively aggregate them. As shown in Figure 1, the autocorrelation function (ACF) and partial autocorrelation function (PACF) between different lagged time series values on the ETTm1, ETTm2, Weather, and PowerLoad datasets reveal a significant correlation between the current value of the sequence and its lagged values. Furthermore, Figure 2 demonstrates that the trend and seasonality components obtained through an additive model align with the patterns of the original sequence. Therefore, the idea of time series decomposition is highly feasible. For example, based on stochastic process theory, Wu et al. proposed a sequence decomposition model, called Autoformer [14], with an autocorrelation attention mechanism instead of point-level connections. Autoformer decomposes the sequence into trend and seasonal terms and uses the autocorrelation attention mechanism to discover cycle-based sequence-level dependencies, realizing sequence-level connections and breaking the bottleneck of information utilization. Meanwhile, Autoformer enhances the model's interpretability from the perspective of sequence decomposition. However, analyzing sequences only from a time-domain perspective makes it difficult to capture complete cyclical patterns on the one hand and increases the computational cost of the model on the other.
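To make the additive decomposition idea concrete, the following is a minimal Python sketch (the helper name and window size are illustrative assumptions, not the paper's implementation) in which a moving average extracts the trend and the remainder is treated as the seasonal component:

```python
import numpy as np

def additive_decompose(x: np.ndarray, window: int = 25):
    """Split a 1-D series into trend and seasonal parts so that x = trend + seasonal."""
    pad = window // 2
    # Edge-replicate padding keeps the moving average the same length as x.
    x_pad = np.pad(x, (pad, window - 1 - pad), mode="edge")
    trend = np.convolve(x_pad, np.ones(window) / window, mode="valid")
    seasonal = x - trend
    return trend, seasonal

t = np.arange(400)
series = 0.05 * t + np.sin(2 * np.pi * t / 24)  # linear trend plus a daily cycle
trend, seasonal = additive_decompose(series)
```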
Frequency-domain feature extraction compensates for the limitations of time-domain analysis. Frequency-domain features, which are sparse and global in nature, can be obtained by transforming time-domain signals into frequency-domain signals using methods such as the Fourier transform and the wavelet transform. Feature extraction modules such as attention and convolution mechanisms can then capture dependencies between sequences while reducing the model's computational cost. For example, FEDformer [15] utilizes the fast Fourier transform and a frequency-domain attention mechanism to learn the frequency-domain correlation of seasonal terms, which reduces time and space complexity while maintaining prediction accuracy and thus saves computational cost. The TimesNet [16] model uses the Fourier transform to identify multiple periodic subsequences, transforms 1D data into 2D data, and utilizes Inception convolution modules to analyze sequence transformations within and between periods. The ensemble empirical mode decomposition and dual-stage attention-based recurrent neural network (EEMD-DARNN) model [17] first decomposes the historical sequence into derived variable sequences using a time-frequency analysis method (ensemble empirical mode decomposition), then applies a two-stage attention mechanism for temporal and spatial feature selection, and finally employs an LSTM for multi-step prediction. Because ultra-short-term power load series exhibit more complex and volatile local periodic patterns, the aforementioned models still have obvious shortcomings in capturing local periodic patterns, which affects the accuracy and stability of time series prediction.
On this basis, this paper explores the energy perspective and finds that power spectral density, as a frequency-domain analysis method, has unique advantages in feature extraction. The physical meaning of power spectral density is the signal energy per unit frequency; it essentially expresses the power contained at different frequencies of the signal. Thomas et al. [18] achieved nonparametric and accurate periodicity detection by using the power spectral density as a new distance measure for extracting the most important periodic features of a sequence. Meanwhile, compared to the Fourier transform, the Morlet continuous wavelet transform offers good time-frequency localization and multi-scale analysis capability in signal processing. Therefore, a sequence decomposition model based on power spectral density and the Morlet continuous wavelet transform is proposed in this paper. The model combines the advantages of power spectral density for accurate extraction of major periodic features with those of the wavelet transform for multi-scale characterization of local periodic patterns; it then utilizes a priori knowledge for attention-targeted steering of time series prediction, ultimately realizing progressive fusion prediction of trend and local seasonal terms. The main contributions of this paper are as follows:
1) An energy selection module based on power spectral density and the Morlet continuous wavelet transform is proposed. Compared with single-domain sequence decomposition models based on either the time or the frequency domain, this module can fully exploit the trend and seasonal terms within the sequence from a multidimensional perspective; meanwhile, the feature selection method based on power spectral density and the Morlet continuous wavelet transform effectively mitigates the high time and space complexity of the traditional attention mechanism, as well as the inability of the Fourier transform to provide effective localized information. In addition, compared with a single-branch structure, the model's multi-branch structure realizes effective fusion of global and local periodic features of the time series across multiple scales;
2) An attention guidance module based on "priori knowledge" is designed. It fully utilizes "priori knowledge" overlooked by other models, injecting it into the model's learning through the attention mechanism to guide the transfer of relevant time series information;
3) A sequence denoising decomposition module is introduced. A bilateral filtering layer is added to the classical sequence decomposition module, which generates a noise residual while gradually decomposing the trend and seasonal terms. Removing the noise residual component suppresses the abrupt fluctuations of anomalous data and enhances the robustness of the model;
4) To meet the practical needs of ultra-short-term power load forecasting, this paper also conducts preliminary screening of relevant variables to reduce the negative effect of some covariates. In addition, sequence probability prediction is introduced: the results are upgraded from a point output to a probabilistic output, and the model results are presented in the form of quantiles.
The power load ultra-short-term time series forecasting task aims to utilize the historical sequence values $X_N^L$ (where $L$ is the sequence length and $N$ is the sequence dimension) to forecast the future target sequence values $O_M^S$ of length $S$ (this paper focuses on univariate forecasting, so $M = 1$). As shown in Figure 3, within the framework of the sequence decomposition model, this paper addresses the difficulty that the original model cannot simultaneously capture complex global seasonal patterns and abrupt local seasonal features, especially for high-frequency load segments that are prone to large fluctuations. To address these issues, a time-frequency energy selection module, a "priori knowledge" guidance module, and a sequence denoising decomposition module are introduced. In the encoder, the time-frequency energy selection module and the sequence denoising decomposition module obtain the dependencies of the historical sequence $X_N^L$. In the decoder, the prediction is generated using prior knowledge, a trend term initialized to the mean of the historical sequence, a seasonal term initialized to 0, and gradual separation and fusion. Additionally, the prediction results are upgraded from a point output to a probabilistic output. The next section introduces six parts: the time-frequency energy selection module, the "priori knowledge" guidance module, the sequence denoising decomposition module, probabilistic load prediction, the evaluation metrics, and the data preprocessing methods.
In this paper, a time-frequency energy selection module based on power spectral density and the Morlet continuous wavelet transform is proposed. The module consists of three parts: a global energy selection module, a local energy selection module, and a period-weighted feature fusion module.
As shown in Figure 4, in the field of signal processing, the magnitude of energy at different frequencies in the power spectral density plot reflects the importance of the corresponding periodic subsequence in the original sequence. In this paper, the energy magnitude in the power spectral density plot is used to indicate the degree of structural similarity between the main periodic sequences and the original sequence, replacing the shape similarity metric in the original attention mechanism. In addition, a time-delay aggregation method [14] is used to fuse the information of sequences with different degrees of importance.
According to Parseval's theorem and the Wiener-Khinchin theorem, the power spectral density of a discrete time series $X_t$ of length $L$ is given by Eq (1):

$$\mathrm{PSD}(f) = \mathrm{FFT}(X_t)\cdot \mathrm{FFT}^{*}(X_t) = \left(\sum_{t=1}^{L} X_t\, e^{-j2\pi tr/L}\right)\cdot \overline{\left(\sum_{t=1}^{L} X_t\, e^{-j2\pi tr/L}\right)} \tag{1}$$

where $\mathrm{PSD}(\cdot)$ denotes the power spectral density function, $t$ is the time index, $t = 1, 2, \ldots, L$, "$*$" denotes the conjugate operation, $j$ is the imaginary unit, usually denoted as $\sqrt{-1}$, $r$ is the frequency index, $r = 1, 2, \ldots, L$, and $f$ stands for frequency, $f = 1, 2, \ldots, L/2$. Using $\mathrm{FFT}(\cdot)$ reduces the time and space complexity of computing the power spectral density.
According to the law of energy conservation, the time-domain energy equals the frequency-domain energy. Therefore, using the $k$ signals $f_i$ ($i \in \{1, \ldots, k\}$) with the largest energy, the corresponding periodic subsequences can be calculated, as shown in Eqs (2) and (3):

$$\{f_1, \cdots, f_k\}, \{w_1, \cdots, w_k\} = \mathrm{ArgTopk}(\mathrm{PSD}(f)) \tag{2}$$

$$p_i = \left\lceil \frac{L}{f_i} \right\rceil, \quad i \in \{1, \cdots, k\} \tag{3}$$

where $\mathrm{ArgTopk}(\cdot)$ selects the frequencies $f_i$ and power spectral density values $w_i$ corresponding to the top $k$ power spectral density values, and $p_i$ is the period corresponding to $f_i$. Here $k = c \cdot \log_e L$, where $c$ is a hyperparameter and $L$ is the input sequence length. Finally, the time-delay aggregation technique and the attention mechanism are fused to maximize the mining of key correlations between sequences, as shown in Eqs (4) and (5):
$$\bar{w}_1, \ldots, \bar{w}_k = \mathrm{SoftMax}(w_1, \cdots, w_k) \tag{4}$$

$$E(X, X, X) = \sum_{i=1}^{k} \mathrm{Roll}(X, p_i)\, \bar{w}_i \tag{5}$$

where $\mathrm{SoftMax}(\cdot)$ denotes the normalization operation, $\bar{w}_i$ is the normalized power spectral density value of the sequence with frequency $f_i$, $\mathrm{Roll}(X, p_i)$ delays the sequence $X$ by $p_i$, and $E$ is the output.
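As a concrete illustration, the following is a minimal PyTorch sketch of Eqs (1)-(5) for a single univariate series; the function name and the default value of $c$ are illustrative assumptions, not the paper's released code:

```python
import math
import torch

def psd_period_aggregate(x: torch.Tensor, c: float = 3.0) -> torch.Tensor:
    """Minimal sketch of Eqs (1)-(5) for a 1-D series x of length L."""
    L = x.shape[0]
    spec = torch.fft.rfft(x)                        # one-sided FFT of the series
    psd = (spec * torch.conj(spec)).real            # Eq (1): PSD = FFT . FFT*
    psd[0] = 0.0                                    # ignore the DC component
    k = max(1, int(c * math.log(L)))                # k = c . log_e(L)
    weights, freqs = torch.topk(psd, k)             # Eq (2): ArgTopk over the PSD
    freqs = freqs.clamp(min=1)
    periods = torch.ceil(L / freqs.float()).long()  # Eq (3): p_i = ceil(L / f_i)
    w_bar = torch.softmax(weights, dim=0)           # Eq (4): normalized energies
    out = torch.zeros_like(x)
    for p, w in zip(periods.tolist(), w_bar):       # Eq (5): weighted time-delay
        out = out + torch.roll(x, shifts=-p) * w    # aggregation via Roll
    return out
```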
Meanwhile, inspired by computer vision network models such as UNet [19] and the Feature Pyramid Network (FPN) [20], this module passes the time series X through multiple parallel 1D convolutional modules to rescale the sequence to different granularities, then lets the time series at different granularities learn in parallel. The core steps are as follows:
Algorithm: Global energy selection module

Input: sequence values $X \in \mathbb{R}^{L \times N}$, where $L$ is the sequence length, $N$ is the number of sequence dimensions, and $d_{model}$ is the number of feature dimensions.
Output: $S_{G\_out}$

1. $X_{emb} = \mathrm{Embedding}(X)$  ($X \in \mathbb{R}^{L \times N}$, $X_{emb} \in \mathbb{R}^{L \times d_{model}}$)
2. $X_{emb1} = \mathrm{Conv1D}(X_{emb})$, $X_{emb2} = \mathrm{Conv1D}(X_{emb1})$  ($X_{emb1} \in \mathbb{R}^{(L/2) \times d_{model}}$, $X_{emb2} \in \mathbb{R}^{(L/4) \times d_{model}}$)
3. for $j \in \{emb, emb1, emb2\}$ do (the dimensions in the loop are shown for the $emb$ branch)
   3.1. $Q, K, V = \mathrm{Linear}(X_j), \mathrm{Linear}(X_j), \mathrm{Linear}(X_j)$  ($Q, K, V \in \mathbb{R}^{L \times d_{model}}$)
   3.2. $\mathrm{PSD}_{QK} = \mathrm{FFT}(Q, \mathrm{dim}=0) \cdot \mathrm{FFT}^{*}(K, \mathrm{dim}=0)$  ($\mathrm{PSD} \in \mathbb{R}^{(L/2) \times d_{model}}$)
   3.3. $frequency\_list, weight\_list = \mathrm{ArgTopk}(\mathrm{PSD}_{QK})$  (both of length $k$)
   3.4. $period\_list = L / frequency\_list$  (length $k$)
   3.5. $\overline{weight\_list} = \mathrm{SoftMax}(w_1, \cdots, w_k)$  (length $k$)
   3.6. $E_j(Q, K, V) = \sum_{i=1}^{k} \mathrm{Roll}(V, p_i) \cdot \overline{weight\_list}_i$
   end for
4. $\bar{E}_1 = \mathrm{ConvTrans1D}(E_1)$, $\bar{E}_2 = \mathrm{ConvTrans1D}(\mathrm{ConvTrans1D}(E_2))$  ($E \in \mathbb{R}^{L \times d_{model}}$, $E_1 \in \mathbb{R}^{(L/2) \times d_{model}}$, $E_2 \in \mathbb{R}^{(L/4) \times d_{model}}$, $\bar{E}_1, \bar{E}_2 \in \mathbb{R}^{L \times d_{model}}$)
5. $S_{G\_out} = \mathrm{Linear}(E, \bar{E}_1, \bar{E}_2)$  ($S_{G\_out} \in \mathbb{R}^{L \times d_{model}}$)
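The multi-branch structure of steps 2, 4, and 5 can be sketched in PyTorch as follows; the kernel sizes, the identity stub for the per-branch energy selection $E_j$, and all names are assumptions for illustration only:

```python
import torch
import torch.nn as nn

class MultiGranularityBranch(nn.Module):
    """Sketch of steps 2, 4, and 5: downsample to L/2 and L/4, apply a
    per-branch energy selection E_j, upsample back, and fuse linearly."""
    def __init__(self, d_model: int, energy_select):
        super().__init__()
        self.down1 = nn.Conv1d(d_model, d_model, kernel_size=3, stride=2, padding=1)
        self.down2 = nn.Conv1d(d_model, d_model, kernel_size=3, stride=2, padding=1)
        self.up1 = nn.ConvTranspose1d(d_model, d_model, kernel_size=4, stride=2, padding=1)
        self.up2a = nn.ConvTranspose1d(d_model, d_model, kernel_size=4, stride=2, padding=1)
        self.up2b = nn.ConvTranspose1d(d_model, d_model, kernel_size=4, stride=2, padding=1)
        self.fuse = nn.Linear(3 * d_model, d_model)
        self.energy_select = energy_select        # E_j, e.g. the PSD aggregation above

    def forward(self, x_emb):                     # x_emb: (B, L, d_model)
        z = x_emb.transpose(1, 2)                 # (B, d_model, L) for Conv1d
        z1 = self.down1(z)                        # granularity L/2
        z2 = self.down2(z1)                       # granularity L/4
        e0 = self.energy_select(x_emb)            # full granularity
        e1 = self.energy_select(z1.transpose(1, 2))
        e2 = self.energy_select(z2.transpose(1, 2))
        e1 = self.up1(e1.transpose(1, 2)).transpose(1, 2)              # back to length L
        e2 = self.up2b(self.up2a(e2.transpose(1, 2))).transpose(1, 2)  # back to length L
        return self.fuse(torch.cat([e0, e1, e2], dim=-1))              # S_G_out

branch = MultiGranularityBranch(d_model=64, energy_select=lambda e: e)  # identity stub
s_g_out = branch(torch.randn(8, 96, 64))          # -> shape (8, 96, 64)
```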
As shown in Figure 5, the core of the local energy selection module is the multi-scale continuous wavelet transform with the Morlet wavelet as the basis function. First, the time series $X_t$ is time-frequency transformed using the multi-scale Morlet transform to generate time-frequency information matrices at different scales, as shown in Eq (6):

$$W(a, b) = \left| \int_{-\infty}^{+\infty} X_t \cdot \overline{\psi}_{a,b}(t)\, dt \right|^2 \tag{6}$$

where $W(a, b)$ is the time-frequency information matrix corresponding to time node $t$ and translation factor $b$ at scale factor $a$, which controls the width of the wavelet function $\overline{\psi}(t)$; $b$ is the translation factor, which controls the movement of the wavelet function; $\overline{\psi}_{a,b}(t)$ is the Morlet wavelet function for the combination $(a, b)$; $\int_{-\infty}^{+\infty}$ denotes the integration operation; and "$\cdot$" denotes multiplication.
Different $(a, b)$ combinations of $W(a, b)$ extract different time-frequency features from the original sequence: a larger $a$ corresponds to a wider Morlet wavelet $\overline{\psi}(t)$, which is better at mining the long-term global sequence feature matrix; a smaller $a$ corresponds to a narrower Morlet wavelet $\overline{\psi}(t)$, which leans toward the localized, abruptly changing sequence feature matrix. The Morlet wavelet function is given by Eq (7):

$$\overline{\psi}_{a,b}(t) = \exp\left(\frac{-i\omega_0 (t-b)}{a}\right)\exp\left(-\frac{(t-b)^2}{2a^2}\right) \tag{7}$$

where $\omega_0$ is the center frequency parameter of the Morlet wavelet, $a$ is the scale factor, $b$ is the translation factor, and $\exp(\cdot)$ is the exponential function.
Then, the $\mathrm{ArgTopk}(\cdot)$ function classifies the generated $W(a, b)$ according to the magnitude of the scale factor $a$, as shown in Eqs (8) and (9):

$$\{W_1, W_2, \cdots, W_k\}, \{W_{k+1}, W_{k+2}, \cdots, W_n\} = \mathrm{ArgTopk}(W(a, b)) \tag{8}$$

$$big\_scales = \{W_1, W_2, \cdots, W_k\}, \quad small\_scales = \{W_{k+1}, W_{k+2}, \cdots, W_n\} \tag{9}$$

where the $big\_scales$ array provides global feature extraction of the time-frequency information matrix, and the $small\_scales$ array provides localized sequence feature extraction. $n$ is the number of scale factors $a$, $n = L / \lceil 2\beta \rceil$, $L$ is the sequence length, and $\beta$ is generally chosen as 1.5. $k = c \cdot \log_e n$, with $c$ as described above.
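A minimal sketch of Eqs (6)-(9) using PyWavelets is shown below; the split by scale magnitude and the exact scale count are assumptions based on the description above, and the function name is illustrative:

```python
import numpy as np
import pywt

def morlet_scale_split(x: np.ndarray, beta: float = 1.5, c: float = 3.0):
    """Multi-scale Morlet CWT (Eq (6)) and the big/small scale split (Eqs (8)-(9))."""
    L = len(x)
    n = max(2, L // int(np.ceil(2 * beta)))   # number of scales, n = L / ceil(2*beta)
    scales = np.arange(1, n + 1)              # ascending scale factors a
    coef, _ = pywt.cwt(x, scales, "morl")     # coef has shape (n, L)
    W = np.abs(coef) ** 2                     # Eq (6): squared magnitude |W(a, b)|^2
    k = max(1, int(c * np.log(n)))            # k = c . log_e(n)
    big_scales = W[-k:]                       # k largest scales: global patterns
    small_scales = W[:-k]                     # remaining small scales: local patterns
    return big_scales, small_scales
```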
Finally, to produce the seasonal term $S_{L\_out}$ and trend term $T_{L\_out}$, asymmetric and symmetric convolutions are used for feature extraction on $small\_scales$ and $big\_scales$, respectively. The specific steps are shown below:
Algorithm: Local energy selection module

Input: sequence values $X \in \mathbb{R}^{L \times N}$, where $L$ is the sequence length, $N$ is the number of sequence dimensions, and $d_{model}$ is the number of feature dimensions.
Output: $S_{L\_out}$, $T_{L\_out}$

1. $X_{emb} = \mathrm{Embedding}(X)$  ($X \in \mathbb{R}^{L \times N}$, $X_{emb} \in \mathbb{R}^{L \times d_{model}}$)
2. $W(a,b) = \left|\int_{-\infty}^{+\infty} X_t \cdot \overline{\psi}_{a,b}(t)\,dt\right|^2$  ($W(a,b) \in \mathbb{R}^{L \times d_{model}}$)
3. $\{W_1, W_2, \cdots, W_k\}, \{W_{k+1}, W_{k+2}, \cdots, W_n\} = \mathrm{ArgTopk}(W(a,b))$  ($W_k, W_n \in \mathbb{R}^{1 \times d_{model}}$)
4. $big\_scales = \{W_1, W_2, \cdots, W_k\}$, $small\_scales = \{W_{k+1}, W_{k+2}, \cdots, W_n\}$  ($big\_scales \in \mathbb{R}^{k \times d_{model}}$, $small\_scales \in \mathbb{R}^{(n-k) \times d_{model}}$)
5. $S_{L\_out} = \mathrm{Linear}(S_1 + S_2 + S_3 + S_4)$, where:
   5.1. $S_1 = \mathrm{BatchNorm}(\mathrm{Conv}_{1\times1}(small\_scales))$
   5.2. $S_2 = \mathrm{BatchNorm}(\mathrm{Conv}_{k\times1}(\mathrm{BatchNorm}(\mathrm{Conv}_{1\times1}(small\_scales))))$
   5.3. $S_3 = \mathrm{BatchNorm}(\mathrm{Conv}_{1\times k}(\mathrm{BatchNorm}(\mathrm{Conv}_{1\times1}(small\_scales))))$
   5.4. $S_4 = \mathrm{BatchNorm}(\mathrm{Conv}_{k\times k}(small\_scales))$
6. $T_{L\_out} = \mathrm{Linear}(T_1 + T_2 + T_3)$, where:
   6.1. $T_1 = \mathrm{BatchNorm}(\mathrm{Conv}_{1\times1}(big\_scales))$
   6.2. $T_2 = \mathrm{BatchNorm}(\mathrm{Conv}_{k\times k}(\mathrm{BatchNorm}(\mathrm{Conv}_{1\times1}(big\_scales))))$
   6.3. $T_3 = \mathrm{BatchNorm}(\mathrm{Conv}_{k\times k}(big\_scales))$
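Step 5 combines asymmetric ($k\times1$, $1\times k$) and symmetric ($k\times k$) convolutions. The following PyTorch sketch shows only the seasonal branch; the channel count and the single-channel treatment of the time-frequency map are illustrative assumptions:

```python
import torch
import torch.nn as nn

class AsymmetricSeasonalHead(nn.Module):
    """Sketch of step 5: four parallel convolution branches over small_scales,
    treated as a single-channel 2-D map of shape (n - k, d_model)."""
    def __init__(self, k: int = 3, ch: int = 8):
        super().__init__()
        self.s1 = nn.Sequential(nn.Conv2d(1, ch, 1), nn.BatchNorm2d(ch))
        self.s2 = nn.Sequential(nn.Conv2d(1, ch, 1), nn.BatchNorm2d(ch),
                                nn.Conv2d(ch, ch, (k, 1), padding=(k // 2, 0)),  # k x 1
                                nn.BatchNorm2d(ch))
        self.s3 = nn.Sequential(nn.Conv2d(1, ch, 1), nn.BatchNorm2d(ch),
                                nn.Conv2d(ch, ch, (1, k), padding=(0, k // 2)),  # 1 x k
                                nn.BatchNorm2d(ch))
        self.s4 = nn.Sequential(nn.Conv2d(1, ch, k, padding=k // 2),             # k x k
                                nn.BatchNorm2d(ch))
        self.out = nn.Linear(ch, 1)

    def forward(self, small_scales):              # (B, n - k, d_model)
        z = small_scales.unsqueeze(1)             # add a channel axis
        s = self.s1(z) + self.s2(z) + self.s3(z) + self.s4(z)   # S1 + S2 + S3 + S4
        return self.out(s.permute(0, 2, 3, 1)).squeeze(-1)      # seasonal output
```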
As shown in Figure 6, the period-weighted feature fusion module mainly utilizes a cycle weighting coefficient (CWC) to adaptively balance the periodicity of the global power spectral density with that of the local Morlet wavelet transform, as shown in Eq (10):

$$\gamma = \mathrm{SoftMax}\left(W_q \cdot \tanh\left(W_g \cdot S_{G\_out} + W_l \cdot S_{L\_out}\right)\right) \cdot S_{L\_out} \tag{10}$$

where $\gamma$ denotes the cycle weighting coefficient, and $S_{G\_out}$ and $S_{L\_out}$ are the outputs of the global and local energy selection modules, respectively. $W_q$, $W_g$, and $W_l$ are learnable parameter matrices that transform the global and local features, and $\tanh$ is the activation function. By learning the weights $\gamma$, the model can adaptively adjust the proportions of the global and local features.
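Eq (10) amounts to a learned gate over the local features; a minimal PyTorch sketch (names and shapes assumed) follows:

```python
import torch
import torch.nn as nn

class CycleWeightedFusion(nn.Module):
    """Sketch of Eq (10): a learned gate gamma balances global (S_G_out)
    and local (S_L_out) periodic features."""
    def __init__(self, d_model: int):
        super().__init__()
        self.w_g = nn.Linear(d_model, d_model, bias=False)  # W_g
        self.w_l = nn.Linear(d_model, d_model, bias=False)  # W_l
        self.w_q = nn.Linear(d_model, d_model, bias=False)  # W_q

    def forward(self, sg_out, sl_out):            # both: (B, L, d_model)
        gate = torch.softmax(
            self.w_q(torch.tanh(self.w_g(sg_out) + self.w_l(sl_out))), dim=-1)
        return gate * sl_out                      # gamma of Eq (10)
```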
As shown in Figure 7, to effectively utilize the future "a priori" covariate information within the decoder module, the relevant covariate sequences $G$ and the seasonal term sequences $S$ initialized to 0 are used as inputs to the global energy selection module. $G = \{g^n_{L/2}, g^n_{L/2+1}, \ldots, g^n_L, g^n_{L+1}, \ldots, g^n_{L+\tau}\}$, where $g^n_{L+\tau}$ stands for the value of the $n$th covariate over the prediction interval of length $\tau$. $S = \{s_{L/2}, s_{L/2+1}, \ldots, s_L, 0, \ldots, 0\}$, where $s_{L/2+i}$ is aligned with the encoder input value $X_{L/2+i}$. Then, the global energy selection module learns the correlations $\mathrm{PSD}_{\bar{Q}\bar{K}}$ for the $G$ sequence and $\mathrm{PSD}_{QK}$ for the $S$ sequence. Finally, the deterministic covariate power spectral density maps $\mathrm{PSD}_{\bar{Q}\bar{K}}$ are used to appropriately correct and guide the original power spectral density maps $\mathrm{PSD}_{QK}$.
This module employs $G$ and $S$ as part of the input to the decoder. It differs from the global energy selection module in the encoder by introducing a knowledge-guided branch $G$. The core steps are as follows:
Algorithm: "Prior knowledge" guidance module |
Input: a sequence of covariates G∈R(L2+τ)×M, and a sequence of seasonal terms S∈R(L2+τ)×N initialized to 0. L represents the length of the input sequence in the encoder, τ represents the length of the prediction, and dmodel represents the size of the feature dimensions. k represents the number of selection period subsequences, N represents the number of sequence dimensions, and M represents the number of "known" relevant covariates in the future. |
Output:E |
1.Semb=Embedding(S),Gemb=Embedding(G) |
Semb∈R(L2+τ)×dmodel,Semb∈R(L2+τ)×dmodel |
2.Q,K,V=Linear(Semb),Linear(Semb),Linear(Semb) |
Q, K, V∈R(L2+τ)×dmodel |
3.¯Q,¯K=Linear(Gemb),Linear(Gemb) |
¯Q,¯K∈R(L2+τ)×dmodel |
4.PSDQK=FFT(Q)⋅FFT∗(K),PSD¯Q¯K=FFT(¯Q)⋅FFT∗(¯K) |
PSDQK,PSD¯Q¯K∈R(L4+τ2)×dmodel |
5.frequency_list,weight_list=ArgTopk(PSDQK+PSD¯Q¯K) |
frequency_list,weight_list∈k |
6.period_list=L2+τfrequency_list |
period_list∈k |
7.¯weight_list=SoftMax(w1,⋯,wk) |
¯weight_list∈k |
8.E(Q,K,V,¯Q,¯K)=∑ki=1 Roll(V,pi)¯weight_list |
E∈R(L2+τ)×dmodel |
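Steps 4 and 5, in which the covariate PSD map corrects the seasonal-branch PSD map before the top-k period selection, can be sketched as follows (all names illustrative):

```python
import torch

def guided_topk_periods(q, k_, q_bar, k_bar, topk: int):
    """Sketch of steps 4-5: the covariate PSD map (prior knowledge) corrects
    the seasonal-branch PSD map before period selection. Inputs are
    (L', d_model) tensors."""
    psd_qk = (torch.fft.rfft(q, dim=0) * torch.conj(torch.fft.rfft(k_, dim=0))).real
    psd_gg = (torch.fft.rfft(q_bar, dim=0) * torch.conj(torch.fft.rfft(k_bar, dim=0))).real
    combined = (psd_qk + psd_gg).mean(dim=-1)     # aggregate over feature dimensions
    combined[0] = 0.0                             # ignore the DC component
    weights, freqs = torch.topk(combined, topk)   # step 5: ArgTopk on the summed maps
    return freqs, torch.softmax(weights, dim=0)   # frequencies and normalized weights
```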
In order to better learn the decomposition pattern in the complex background, this module adopts a method based on the idea of sequence decomposition and bilateral filtering [21] to decompose the sequence into trend terms, seasonal terms, and residual noise components. As shown in Figure 8, without affecting the peaks and valleys of the series, the module gradually removes the appropriate amount of residual noise components through the average pooling layer and the bilateral filtering layer, so as to gradually decompose the series into trend terms and seasonal terms and capture the long-term profile and local information of the time series.
First, the trend term $Trend$ is extracted, and the noisy seasonal term $Seasonal'$ is separated, from the sequence $S\_out$ by multiple average pooling layers with different window sizes, as shown in Eqs (11) and (12):

$$Trend = \mathrm{Linear}(\mathrm{AveragePooling}(S\_out)) \tag{11}$$

$$Seasonal' = S\_out - Trend \tag{12}$$

where $\mathrm{AveragePooling}(\cdot)$ represents the average pooling operation.
Then, the seasonal term $Seasonal$ is generated by applying bilateral filtering to the noisy sequence $Seasonal'$, thereby eliminating the residual noise, as shown in Eqs (13) and (14):

$$Seasonal = \sum_{i \in I} w_{ti} \cdot (Seasonal')_i, \quad I = \left[(t-M)^{+}, (t+M)^{-}\right] \tag{13}$$

$$w_{ti} = e^{-\frac{|i-t|^2}{2\delta_d^2}}\, e^{-\frac{\left|(Seasonal')_i - (Seasonal')_t\right|^2}{2\delta_i^2}} \tag{14}$$

where $(t-M)^{+}$ denotes $\max(0, t-M)$ and $(t+M)^{-}$ denotes $\min(t+M, L)$. In this procedure, $w_{ti}$ is the filtering weight at time index $t$ and window index $i$, the filter window size $I$ is set to $2M+1$, the initial values of the bilateral filter parameters $\delta_d$ and $\delta_i$ are both 1.0, and $M$ is set to 12.
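A direct NumPy sketch of the bilateral filtering layer of Eqs (13)-(14) follows; the explicit weight normalization, standard for bilateral filtering, is an assumption since Eq (13) leaves it implicit:

```python
import numpy as np

def bilateral_filter_1d(x: np.ndarray, M: int = 12,
                        delta_d: float = 1.0, delta_i: float = 1.0) -> np.ndarray:
    """1-D bilateral filter over a noisy seasonal series (Eqs (13)-(14))."""
    L = len(x)
    out = np.empty(L)
    for t in range(L):
        lo, hi = max(0, t - M), min(t + M, L - 1)           # window I = [(t-M)+, (t+M)-]
        idx = np.arange(lo, hi + 1)
        w = (np.exp(-(idx - t) ** 2 / (2 * delta_d ** 2)) *           # distance term
             np.exp(-(x[idx] - x[t]) ** 2 / (2 * delta_i ** 2)))      # intensity term
        out[t] = np.sum(w * x[idx]) / np.sum(w)             # weighted, normalized average
    return out
```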
Practical industrial environments frequently require predictive assessments for managing risk outcomes. Probability forecasting offers an effective approach to quantitatively balance risks. For example, in the day-ahead scheduling of the power industry, operators utilize probability forecasts to reasonably allocate peak load capacity, thus aiming to conserve resources. In power trading, traders can optimize pricing strategies using uncertain forecast information to maximize profits. Therefore, in this paper, quantile loss [22] is utilized to achieve the output of the prediction interval by setting different output quantile parameters α. The final probability loss function is achieved by training all parameters α to minimize the quantile loss. The probability loss function is shown in Eqs (15) and (16):
$$L(\Omega, W) = \sum_{y_t \in \Omega}\, \sum_{\alpha \in A}\, \sum_{\tau=1}^{\tau_{max}} \frac{AL\left(y_t, \hat{y}(\alpha, t-\tau, \tau), \alpha\right)}{M\,\tau_{max}} \tag{15}$$

$$AL(y, \hat{y}, \alpha) = \alpha\,(y - \hat{y})^{+} + (1-\alpha)\,(\hat{y} - y)^{+} \tag{16}$$

where the training domain $\Omega$ consists of $M$ training data samples, $W$ represents the model parameters, $AL(\cdot)$ is the quantile loss, and $\alpha$ is the output quantile parameter. $A$ is the set of values of the quantile parameter $\alpha$. Here, $y$, $\hat{y}$, and $\tau$ represent the true value, predicted value, and prediction interval, respectively. The notation $(x)^{+}$ denotes $\max(0, x)$.
To compare the quantile loss associated with different $\alpha$ parameters in the testing domain $\tilde{\Omega}$, the evaluation metric "$\alpha$-Risk" is defined as follows:

$$\alpha\text{-Risk} = \frac{2 \sum_{y_t \in \tilde{\Omega}} \sum_{\tau=1}^{\tau_{max}} AL\left(y_t, \hat{y}(\alpha, t-\tau, \tau), \alpha\right)}{\sum_{y_t \in \tilde{\Omega}} \sum_{\tau=1}^{\tau_{max}} |y_t|} \tag{17}$$

where "$\alpha$-Risk" represents the quantile loss value for a given value of the parameter $\alpha$; $y$, $\hat{y}$, $\tau$, and $\alpha$ have the same meanings as previously described, and $\tilde{\Omega}$ represents the testing domain.
In addition to the use of "α-Risk" as an evaluation metric for probabilistic load forecasting, the mean square error (MSE) and mean absolute error (MAE) are used as evaluation metrics for point forecasting, as shown in Eqs (18) and (19):

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2 \tag{18}$$

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right| \tag{19}$$

where MSE is the mean squared difference and MAE the mean absolute difference between the predicted and true values; $n$ is the number of samples, $y_i$ is the true value, and $\hat{y}_i$ is the predicted value.
MSE converges faster but is more sensitive to outliers, magnifying the differences between errors. MAE is less susceptible to outliers, weighting all errors equally, but converges more slowly. Therefore, to comprehensively assess the model performance, this paper adopts both MSE and MAE as evaluation metrics for point prediction.
In the field of time series forecasting, too many covariates can lead to the curse of dimensionality, which increases the time and space complexity of the model, makes the model difficult to train and generalize, and can even negatively affect the predicted series; thus, preliminary screening of covariates is a common practice. In this paper, the Pearson correlation coefficient is used to screen continuous variables, and the correlation ratio is used to screen discrete variables, realizing the preprocessing of the data.
$\rho_{X_i, Y}$ represents the Pearson correlation coefficient between the continuous covariate series $X_i$ and the predicted series $Y$, given by Eq (20):

$$\rho_{X_i, Y} = \frac{E(X_i Y) - E(X_i)E(Y)}{\sqrt{E(X_i^2) - \left(E(X_i)\right)^2}\,\sqrt{E(Y^2) - \left(E(Y)\right)^2}} \tag{20}$$

where $X_i$ and $Y$ denote the $i$th continuous covariate sequence and the predicted sequence, respectively, and $E(\cdot)$ represents the mean operation.
$\eta$ represents the correlation ratio between a discrete covariate series and the predicted series, given by Eqs (21) and (22):

$$\eta = \sqrt{\frac{\sum_{x} n_{ix}\left(\bar{y}_{ix} - \bar{y}_i\right)^2}{\sum_{x,j}\left(y_{ixj} - \bar{y}_i\right)^2}} \tag{21}$$

$$\bar{y}_{ix} = \frac{\sum_{j} y_{ixj}}{n_{ix}}, \quad \bar{y}_i = \frac{\sum_{x} n_{ix}\bar{y}_{ix}}{\sum_{x} n_{ix}} \tag{22}$$

where $y_{ixj}$ is the $j$th predicted sequence value of category $x$ in the $i$th discrete covariate sequence, $n_{ix}$ is the number of samples of category $x$ in the $i$th discrete covariate sequence, $\bar{y}_{ix}$ is the predicted mean of category $x$ in the $i$th sequence, and $\bar{y}_i$ is the predicted mean over all categories in the $i$th sequence. $\sum_x$ accumulates the weighted variance of the category means, and $\sum_{x,j}$ accumulates the variance over all discrete series.
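Both screening statistics are straightforward to compute; a NumPy sketch of Eqs (20)-(22) with hypothetical function names is given below:

```python
import numpy as np

def pearson(x: np.ndarray, y: np.ndarray) -> float:
    """Eq (20): Pearson correlation between a continuous covariate and the target."""
    ex, ey = x.mean(), y.mean()
    return ((x * y).mean() - ex * ey) / (
        np.sqrt((x ** 2).mean() - ex ** 2) * np.sqrt((y ** 2).mean() - ey ** 2))

def correlation_ratio(categories: np.ndarray, y: np.ndarray) -> float:
    """Eqs (21)-(22): correlation ratio eta between a discrete covariate and the target."""
    y_bar = y.mean()
    between = 0.0
    for x in np.unique(categories):
        y_x = y[categories == x]
        between += len(y_x) * (y_x.mean() - y_bar) ** 2     # weighted class-mean variance
    total = ((y - y_bar) ** 2).sum()                        # total variance
    return float(np.sqrt(between / total))
```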
TFDNet was extensively evaluated on four datasets, including three publicly available benchmark datasets and one private dataset. Here is a brief introduction to these datasets:
Electricity Transformer Dataset (ETT): The ETT dataset is a crucial indicator of long-term electricity allocation. It consists of data collected at 15-minute intervals for two years, from July 2016 to July 2018, in two counties in China. Experiments on ETTm1 (15 minutes) and ETTm2 (15 minutes) were conducted based on the collection location. Each data point is comprised of the target value "Oil Temperature" and six related features. The dataset is divided into training, validation, and test sets in a 7:1:2 ratio.
Weather: This dataset contains data from the meteorological field, collected at 10-minute intervals, covering observations from January 2020 to January 2021. Each data point includes the target value "OT" and 20 related features. The dataset is divided into training, validation, and test sets in a 7:1:2 ratio.
PowerLoad: This dataset records the total power load values in a city-level region in China. The data is collected at 15-minute intervals, spanning from October 2020 to October 2022. Each data point includes the target value "consumption" along with 10 related features, namely "type", "temperature", "humidity", "wind speed", "windscale", "wind angle", "wind direction", "air pressure", "visibility", and "precipitation". Similarly, the dataset is divided into training, validation, and test sets in a 7:1:2 ratio.
This study selected the following six time series forecasting methods as benchmark comparisons: Informer, Autoformer, FEDformer, TimesNet, RLF-MGNN, and EEMD-DARNN. These six methods represent different approaches and techniques in the field of time series forecasting, thus providing a comprehensive set of benchmarks for experimentation and comparison.
In this experiment, training was conducted with three different loss functions: mean squared error (MSE), mean absolute error (MAE), and quantile (probability) loss. The adaptive moment estimation (ADAM) optimizer was employed with an initial learning rate of $10^{-4}$, and the batch size was set to 32. Early stopping was applied within 10 epochs. All experiments were repeated five times. The PyTorch platform was used with a single NVIDIA TITAN XP 12GB GPU. The set of values of the quantile parameter α was {0.1, 0.5, 0.9}.
Since the ETTm1, ETTm2, and Weather datasets contain only continuous variables, only ρ was used for these three datasets: ρ was computed and ranked for each covariate series against the predicted series, and the top u covariates were selected as model inputs. For the PowerLoad dataset, both ρ and η were used: ρ was computed and ranked for each continuous covariate series, and η for each discrete covariate series. Ultimately, the top u continuous covariates and top v discrete covariates were chosen as inputs to the model.
This experiment adopted different covariate screening strategies for the different datasets: u = 3 covariates were selected for the ETTm1 and ETTm2 datasets, u = 12 for the Weather dataset, and u = 6 continuous plus v = 1 discrete covariates for the PowerLoad dataset. As shown in Figures 9 and 10, the heat maps show the relative magnitudes of the Pearson correlation coefficients of the continuous covariates in the four datasets, while the pie chart shows the relative magnitudes of the correlation ratios of the discrete covariates in the PowerLoad dataset.
Table 1 summarizes the experimental results of the seven models on the four datasets. The MSE and MAE loss functions were used as evaluation metrics, where smaller values indicate more accurate predictions. Additionally, this experiment extends the prediction window length to test the models' prediction stability. The best experimental results are highlighted in bold.
| Dataset | Horizon | Informer MSE | Informer MAE | Autoformer MSE | Autoformer MAE | FEDformer MSE | FEDformer MAE | TimesNet MSE | TimesNet MAE | RLF-MGNN MSE | RLF-MGNN MAE | EEMD-DARNN MSE | EEMD-DARNN MAE | TFDNet MSE | TFDNet MAE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ETTm1 | 96 | 0.111 | 0.278 | 0.055 | 0.185 | 0.038 | 0.150 | 0.037 | 0.148 | 0.038 | 0.151 | **0.031** | **0.140** | 0.033 | 0.142 |
| ETTm1 | 192 | 0.151 | 0.313 | 0.080 | 0.216 | 0.065 | 0.204 | 0.063 | 0.201 | 0.065 | 0.199 | 0.060 | **0.194** | **0.058** | 0.209 |
| ETTm1 | 336 | 0.428 | 0.595 | 0.087 | 0.218 | 0.074 | 0.213 | 0.071 | 0.211 | 0.072 | 0.213 | 0.071 | 0.215 | **0.067** | **0.210** |
| ETTm1 | 720 | 0.438 | 0.588 | 0.111 | 0.264 | 0.107 | 0.254 | 0.094 | 0.231 | 0.100 | 0.253 | 0.093 | 0.241 | **0.088** | **0.225** |
| ETTm2 | 96 | 0.088 | 0.225 | 0.068 | 0.189 | 0.067 | 0.191 | 0.067 | 0.192 | 0.071 | 0.203 | 0.063 | 0.191 | **0.061** | **0.187** |
| ETTm2 | 192 | 0.134 | 0.283 | 0.119 | 0.258 | 0.114 | 0.249 | 0.101 | 0.244 | 0.104 | 0.251 | 0.109 | 0.248 | **0.094** | **0.241** |
| ETTm2 | 336 | 0.180 | 0.337 | 0.154 | 0.307 | 0.140 | 0.303 | 0.135 | 0.300 | 0.143 | 0.299 | 0.134 | 0.294 | **0.122** | **0.281** |
| ETTm2 | 720 | 0.302 | 0.440 | 0.184 | 0.343 | 0.215 | 0.373 | 0.192 | 0.351 | 0.198 | 0.355 | 0.185 | 0.333 | **0.172** | **0.329** |
| Weather | 96 | 0.0042 | 0.046 | 0.018 | 0.082 | 0.0037 | 0.048 | 0.0033 | 0.041 | 0.0038 | 0.047 | 0.0031 | 0.038 | **0.0029** | **0.031** |
| Weather | 192 | **0.0025** | **0.043** | 0.0073 | 0.072 | 0.0058 | 0.062 | 0.0043 | 0.051 | 0.0048 | 0.059 | 0.0045 | 0.055 | 0.0040 | 0.044 |
| Weather | 336 | 0.0044 | 0.052 | 0.0065 | 0.064 | 0.008 | 0.077 | 0.0057 | 0.068 | 0.0073 | 0.067 | 0.0056 | 0.064 | **0.0041** | **0.049** |
| Weather | 720 | 0.0043 | 0.049 | 0.0085 | 0.074 | 0.018 | 0.094 | 0.0049 | 0.081 | 0.0051 | 0.064 | 0.0041 | 0.049 | **0.0036** | **0.047** |
| PowerLoad | 96 | 0.367 | 0.362 | 0.254 | 0.264 | 0.204 | 0.234 | 0.188 | 0.217 | 0.193 | 0.305 | **0.172** | **0.211** | 0.178 | 0.325 |
| PowerLoad | 192 | 0.422 | 0.424 | 0.303 | 0.345 | 0.283 | 0.345 | 0.250 | **0.319** | 0.288 | 0.351 | 0.284 | 0.350 | **0.232** | 0.355 |
| PowerLoad | 336 | 0.534 | 0.583 | 0.403 | 0.452 | 0.386 | 0.395 | 0.385 | **0.370** | 0.395 | 0.401 | 0.391 | 0.393 | **0.381** | 0.373 |
| PowerLoad | 720 | 0.644 | 0.743 | 0.534 | 0.537 | 0.430 | 0.434 | 0.427 | 0.431 | 0.447 | 0.453 | 0.436 | 0.447 | **0.423** | **0.427** |
| Count | | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 2 | 3 | 13 | 10 |
When examining Table 1 vertically, TFDNet shows significant improvements across all datasets, achieving the highest "Count" value (Count is the number of times each model achieved the best performance across all tasks). The prediction errors increase only slowly as the prediction window is extended, demonstrating the stability and scalability of TFDNet in time series forecasting.
When examining Table 1 horizontally, TFDNet outperforms Informer, Autoformer, FEDformer, TimesNet, RLF-MGNN, and EEMD-DARNN in terms of both MSE and MAE. Compared to these models, TFDNet reduces the average MSE by 36.59, 29.56, 20.34, 9.68, 15.25, and 7.35% across the four datasets, respectively, indicating its superiority over the other forecasting models. Additionally, the effectiveness of time-frequency domain analysis relative to single-domain analysis is verified.
Figure 11 displays the predicted curves and true value curves for the seven models across the four datasets, each with a prediction length of 96. From the graph, it is evident that Informer, Autoformer, FEDformer, and RLF-MGNN perform relatively poorly. TimesNet captures the trend well but exhibits significant fluctuations. Both EEMD-DARNN and TFDNet fit the ground truth better and capture the long-term trends and seasonal shifts of the series in a timely manner, but TFDNet fits local peaks and troughs better than EEMD-DARNN.
Table 2 summarizes the experimental results of the probabilistic prediction of the five models on the four datasets, with the quantile loss "α-Risk" as the evaluation index of the probabilistic prediction. The values of α are taken as {0.1, 0.5, 0.9}, corresponding to {P10, P50, P90} quantile losses, respectively. A lower value indicates more accurate predictions. The percentages in parentheses represent the improvement achieved by the TFDNet model over the comparison models, namely Autoformer, TimesNet, RLF-MGNN, and EEMD-DARNN.
| Model | ETTm1 P10 | ETTm1 P50 | ETTm1 P90 | ETTm2 P10 | ETTm2 P50 | ETTm2 P90 |
|---|---|---|---|---|---|---|
| Autoformer | 0.203 (+17.73%) | 0.248 (+28.22%) | 0.104 (+42.30%) | 0.220 (+27.73%) | 0.237 (+19.41%) | 0.112 (+34.82%) |
| TimesNet | 0.188 (+11.17%) | 0.205 (+13.17%) | 0.081 (+25.93%) | 0.207 (+23.19%) | 0.214 (+10.75%) | 0.096 (+23.96%) |
| RLF-MGNN | 0.195 (+14.36%) | 0.227 (+21.59%) | 0.088 (+31.82%) | 0.213 (+25.35%) | 0.224 (+14.73%) | 0.101 (+27.72%) |
| EEMD-DARNN | 0.176 (+5.11%) | 0.194 (+8.25%) | 0.077 (+22.08%) | 0.194 (+18.04%) | 0.204 (+6.37%) | 0.088 (+17.05%) |
| TFDNet | 0.167 | 0.178 | 0.060 | 0.159 | 0.191 | 0.073 |

| Model | Weather P10 | Weather P50 | Weather P90 | PowerLoad P10 | PowerLoad P50 | PowerLoad P90 |
|---|---|---|---|---|---|---|
| Autoformer | 0.053 (+28.30%) | 0.073 (+21.92%) | 0.029 (+27.59%) | 0.135 (+14.81%) | 0.148 (+9.45%) | 0.121 (+28.09%) |
| TimesNet | 0.049 (+22.45%) | 0.062 (+8.06%) | 0.025 (+16.00%) | 0.118 (+2.54%) | 0.139 (+3.60%) | 0.091 (+4.40%) |
| RLF-MGNN | 0.051 (+25.49%) | 0.070 (+18.57%) | 0.029 (+27.59%) | 0.127 (+9.45%) | 0.144 (+6.94%) | 0.104 (+16.35%) |
| EEMD-DARNN | 0.041 (+7.32%) | 0.061 (+6.56%) | 0.022 (+4.55%) | 0.119 (+3.36%) | 0.138 (+2.90%) | 0.093 (+6.45%) |
| TFDNet | 0.038 | 0.057 | 0.021 | 0.115 | 0.134 | 0.087 |
Examining the results in Table 2 horizontally, the proposed TFDNet demonstrates significant improvements in P10, P50, and P90 over Autoformer, TimesNet, RLF-MGNN, and EEMD-DARNN. On the ETTm1 dataset, TFDNet improves by 29.42, 16.76, 22.59, and 11.81% relative to Autoformer, TimesNet, RLF-MGNN, and EEMD-DARNN, respectively. On the ETTm2 dataset, the improvements are 27.32, 19.30, 22.60, and 13.82%, respectively. On the Weather dataset, they are 25.94, 15.50, 23.88, and 6.14%, respectively. On the PowerLoad dataset, they are 17.46, 3.51, 10.91, and 4.24%, respectively.
Examining the results in Table 2 vertically, TFDNet outperforms the other four benchmark models on all datasets, showing that TFDNet is suitable not only for point prediction but also for probabilistic prediction. On the ETTm1 dataset, TFDNet improves on average by 12.09, 17.81, and 30.53% in P10, P50, and P90, respectively. On the ETTm2 dataset, the average improvements are 23.58, 12.82, and 25.89%; on the Weather dataset, 20.89, 13.78, and 18.93%; and on the PowerLoad dataset, 7.54, 5.72, and 13.82%.
As shown in Figure 12, the probability prediction outputs of TFDNet and TimesNet on the ETTm1 and PowerLoad datasets are presented. From the graph, it can be observed that at the P50 quantile, TFDNet (green line) demonstrates a better fit to the ground truth (red line) as compared to TimesNet (gray line). Furthermore, in the P10-P90 quantile range, the range represented by TFDNet (blue area) is both comprehensive and accurate in comparison to TimesNet (pink area).
As shown in Table 3, to verify the effectiveness of 1) the sequence denoising decomposition module, 2) the time-frequency energy selection module, and 3) the "priori knowledge" guidance module, ablation experiments were designed on the ETTm1 and PowerLoad datasets. The input sequence length was 96, and the predicted output lengths were {96, 192, 336, 720}. The evaluation metric is the MSE loss. Autoformer was used as the prototype, and rows 1-8 represent the eight configurations 000, 001, 010, 011, 100, 101, 110, and 111, respectively (a 1 means the module at the corresponding position is substituted in, and a 0 means it is not). "−" indicates no experimental result, because the ETTm1 dataset contains no "a priori" covariate information, so the "priori knowledge" guidance module was not used.
| Variant | ETTm1, 96 | ETTm1, 192 | ETTm1, 336 | ETTm1, 720 | PowerLoad, 96 | PowerLoad, 192 | PowerLoad, 336 | PowerLoad, 720 |
|---|---|---|---|---|---|---|---|---|
| 1 | 0.055 | 0.080 | 0.087 | 0.111 | 0.254 | 0.303 | 0.403 | 0.534 |
| 2 | − | − | − | − | 0.250 | 0.302 | 0.401 | 0.530 |
| 3 | 0.037 | 0.061 | 0.069 | 0.089 | 0.194 | 0.257 | 0.388 | 0.447 |
| 4 | − | − | − | − | 0.188 | 0.252 | 0.386 | 0.445 |
| 5 | 0.041 | 0.069 | 0.075 | 0.092 | 0.239 | 0.298 | 0.399 | 0.521 |
| 6 | − | − | − | − | 0.231 | 0.292 | 0.398 | 0.516 |
| 7 | − | − | − | − | 0.180 | 0.241 | 0.382 | 0.428 |
| 8 | 0.033 | 0.058 | 0.067 | 0.088 | 0.178 | 0.232 | 0.381 | 0.423 |
As can be seen from Table 3, the "priori knowledge" guidance module, the time-frequency energy selection module, and the sequence denoising decomposition module all improve the experimental results to different degrees. For the ETTm1 dataset, variants 3 and 5 improve on variant 1 by 24.25 and 17.53%, respectively. For the PowerLoad dataset, variants 2, 3, and 5 improve on variant 1 by 0.79, 14.70, and 2.75%, respectively.
Table 3 also shows that the three modules contribute different levels of improvement. On the ETTm1 dataset, a comparison among variants 1, 3, and 5 reveals that the time-frequency energy selection module leads to the most significant improvement. On the PowerLoad dataset, the time-frequency energy selection module yields the greatest improvement, followed by the sequence denoising decomposition module, with the "priori knowledge" guidance module providing the smallest enhancement.
The baseline methods discussed above generally struggle to capture local mutation features and exhibit poor stability. Therefore, this section further analyzes and verifies the effectiveness of the time-frequency energy selection module in capturing local mutation features, the effectiveness of the "priori knowledge" guidance module and the sequence denoising decomposition module in addressing poor stability, and the reasonableness of the model parameters. In the following, the distribution of the time-frequency energy selection module, the distribution of the "priori knowledge" guidance module, the sequence denoising decomposition module, the distribution of the predicted output sequences, and the model parameters are analyzed through further experiments.
Figure 13 further shows the power spectrum energy distribution of the global energy selection module on the ETTm1, ETTm2, Weather, and PowerLoad datasets. When the input branch length is L = 96, the orange regions (larger power spectral energy values) are relatively concentrated, indicating that the model captures the most relevant sequences in the most recent time. When the input branch length is L/2 = 48, the orange regions are more dispersed, indicating that the model captures relevant sequences over a relatively more distant time. When the input branch length is L/4 = 24, the orange regions are the most dispersed, indicating that the model mainly captures relevant sequences in the most distant time. This is due to the multi-branch structure of the global energy selection module and the accurate selection of periodic subsequences at different granularities by the power spectral density. When the length of the input sequence changes, the power spectral density can mine deeper periodic relationships in the original sequence; through effective fusion of the multi-granularity subsequences, the model can then comprehensively capture short-, medium-, and long-term sequence correlations, which further validates the effectiveness of the global energy selection module.
As shown in Figure 14, localized distribution validation experiments were performed on the ETTm2, Weather, and PowerLoad datasets. The time-frequency information matrices W of small_scales and big_scales generated by the Morlet wavelet transform in the local energy selection module show clear stratification. In this experiment, the scale parameter c for big_scales and small_scales was set to 3. big_scales exhibits a wide distribution, while small_scales exhibits a multi-point, localized distribution. On the ETTm2 dataset, A1 and A2 are localized distributions with obvious volatility, which are well captured by small_scales in the time-frequency plots. On the Weather dataset, A1, A2, and A3 are localized distributions with obvious volatility, likewise well captured by small_scales. Because the PowerLoad dataset contains no localized distributions with significant volatility, its distributions are concentrated on big_scales. Since the Morlet wavelet transform used in the local energy selection module provides time-frequency localization, setting an appropriate ratio of big_scales to small_scales allows the asymmetric convolutions to learn the local mutation characteristics of the sequence in a targeted manner and improves the model's ability to fit the peaks and valleys of the sequence, which verifies the effectiveness of the local energy selection module.
The power spectrum energy magnitude was extracted from four datasets for the two cases of no-knowledge guidance and knowledge guidance, thus verifying the impact of the "priori knowledge" guidance module on the model. Since there was no explicit "priori knowledge" in the ETTm1, ETTm2, and Weather datasets, this analytical experiment was simulated using a random covariate as the "priori knowledge". As shown in Figure 15, the power spectral energy distribution of the four data sets with knowledge guidance is significantly different and more widely distributed than that without knowledge guidance. Due to the directed learning of sequence features in the knowledge-guided branch of the "priori knowledge" guidance module, the model is able to capture possible unconventional changes in the future, thus improving its stability.
Figure 16 illustrates the visual sequence decomposition process on the Weather dataset, validating the necessity of the sequence denoising decomposition modules. The four plots show the initial decomposition and the outputs of the three sequence denoising decomposition modules. As can be seen in Figure 16, the model gradually aggregates and refines the trend and seasonal terms of the sequence and effectively removes the noise components. In addition, the module effectively suppresses abnormal fluctuations in the data and shows remarkable results in long-term sequence prediction tasks. This effect is mainly attributable to the sequence decomposition architecture, consisting of the multiple sequence denoising decomposition modules of TFDNet, which fits the trend and seasonal terms asymptotically; at the same time, the bilateral filtering layer in each module effectively filters the residual noise components in the sequence. It should be noted that all data are normalized, and a bias of +0.02 is applied to the original series and trend terms.
As shown in Table 4, the Kolmogorov-Smirnov test was used to quantitatively evaluate the similarity in distribution between the input and predicted output sequences for similar sequence decomposition models on the ETTm1 and PowerLoad datasets. The experiments used an input sequence length of 96 and prediction output lengths of {96, 192, 336, 720}. For both datasets, the significance threshold was set to 0.01, and the null hypothesis was that the two series come from the same distribution.
| Dataset | Horizon | Autoformer P-value | FEDformer P-value | EEMD-DARNN P-value | TFDNet P-value | True P-value |
|---|---|---|---|---|---|---|
| ETTm1 | 96 | 0.029 | 0.050 | 0.044 | 0.042 | 0.031 |
| ETTm1 | 192 | 0.016 | 0.015 | 0.014 | 0.027 | 0.024 |
| ETTm1 | 336 | 0.009 | 0.012 | 0.008 | 0.018 | 0.017 |
| ETTm1 | 720 | 0.006 | 0.010 | 0.003 | 0.009 | 0.005 |
| PowerLoad | 96 | 0.023 | 0.028 | 0.039 | 0.046 | 0.034 |
| PowerLoad | 192 | 0.020 | 0.019 | 0.018 | 0.018 | 0.015 |
| PowerLoad | 336 | 0.007 | 0.010 | 0.006 | 0.012 | 0.013 |
| PowerLoad | 720 | 0.003 | 0.008 | 0.003 | 0.009 | 0.011 |
| Count | | 1 | 2 | 0 | 5 | NA |
Comparing the P-values: when the prediction length is 96, the P-values of all four models exceed 0.01, indicating that the predicted output sequence most likely comes from the same distribution as the input sequence, which is due to the sequence decomposition mechanism adopted by all four models. However, as the prediction window grows, the P-values of all four models decrease to different degrees. The EEMD-DARNN model shows the largest decrease, mainly due to its poor ability to capture features in "inflection point" regions.
Compared with Autoformer, FEDformer, and EEMD-DARNN, the proposed method fails to reject the null hypothesis in five of the eight settings (Count = 5), versus one for Autoformer, two for FEDformer, and zero for EEMD-DARNN. Moreover, its P-values remain higher than those of the other models while staying closest to the P-values of the true output distribution. This implies that the model better captures unanticipated future changes, which validates the effectiveness of the "prior knowledge" guidance module.
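The evaluation in Table 4 can be reproduced in outline with SciPy's two-sample Kolmogorov-Smirnov test, as in the sketch below. The synthetic windows are assumptions; only the test itself and the 0.01 decision rule follow the text.

```python
# Hedged sketch of the Table 4 procedure: a two-sample KS test between an
# input window and a predicted output window. With significance level 0.01,
# p > 0.01 means we cannot reject that the two windows share a distribution.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(42)
input_seq = np.sin(np.linspace(0, 12 * np.pi, 96)) + 0.1 * rng.standard_normal(96)
pred_seq = np.sin(np.linspace(12 * np.pi, 24 * np.pi, 96)) + 0.1 * rng.standard_normal(96)

stat, p_value = ks_2samp(input_seq, pred_seq)
print(f"KS statistic = {stat:.3f}, p-value = {p_value:.3f}")
print("same distribution not rejected" if p_value > 0.01 else "distributions differ")
```

The Count row of Table 4 then simply tallies, per model, how many of the eight settings yield p > 0.01.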
The number of granularities was selected as follows. The effect on model accuracy was analyzed by varying the number of global energy selection modules operating at different granularities of the sequence. As shown in Figure 17, except for Size_1, the MSE values of Size_3, Size_5, and Size_7 gradually converge as the number of epochs increases, with Size_3 converging faster than the other two. The reason is that multi-granularity learning through the branching structure of the global energy selection module does capture the long-, medium-, and short-range features of the sequence, which substantially improves model performance; however, too many branches degrade training efficiency.
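As a rough illustration of such branching, the sketch below views the same window at several temporal granularities via average pooling, one branch per granularity. Reading Size_3 as "three parallel branches" is our assumption about Figure 17, and the kernel sizes are illustrative.

```python
# Hedged sketch of multi-granularity branching: parallel average-pooling
# branches expose long-, medium-, and short-range views of the same window.
# Branch count and kernel sizes are assumptions, not the paper's settings.
import torch
import torch.nn as nn

class MultiGranularityBranches(nn.Module):
    def __init__(self, kernel_sizes=(2, 4, 8)):          # three granularities
        super().__init__()
        self.pools = nn.ModuleList(
            nn.AvgPool1d(kernel_size=k, stride=k) for k in kernel_sizes
        )

    def forward(self, x):                                # x: (batch, channels, length)
        return [pool(x) for pool in self.pools]          # one coarser view per branch

x = torch.randn(8, 1, 96)                                # batch of 96-step load windows
views = MultiGranularityBranches()(x)
print([v.shape[-1] for v in views])                      # -> [48, 24, 12]
```

Each extra branch adds a full feature-extraction path, which is consistent with the observation that too many branches slow training without further accuracy gains.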
The hyperparameter c determines the number of cycle subsequences selected by ArgTopk() and the allocation ratio of big_scales to small_scales. Experiments were conducted on the PowerLoad dataset with an input sequence length of 96 and a prediction length of 96, setting c to 1, 2, 3, and 4 to verify its effect on model accuracy. As shown in Figure 18, except for c = 1, the MSE curves for c = 2, c = 3, and c = 4 gradually converge as training proceeds, with c = 3 converging fastest. The analysis shows that with c = 3, the global and local energy selection modules better capture the temporal features, so the model lowers its MSE while keeping the computational complexity down.
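A minimal sketch of the ArgTopk() idea follows: pick the c cycle lengths whose FFT amplitudes carry the most energy. Splitting the selected periods into big_scales and small_scales by a period threshold is our assumption about how the allocation ratio in the text works.

```python
# Hedged sketch of ArgTopk() cycle selection: the c strongest frequency bins
# of the FFT are converted to cycle lengths and split into big/small scales.
# The split_period threshold is an illustrative assumption.
import numpy as np

def argtopk_cycles(x: np.ndarray, c: int = 3, split_period: int = 48):
    amp = np.abs(np.fft.rfft(x))
    amp[0] = 0.0                                  # drop the DC component
    top = np.argsort(amp)[-c:][::-1]              # c strongest frequency bins
    periods = len(x) / top                        # convert bin index to cycle length
    big = periods[periods >= split_period]        # long cycles -> big_scales
    small = periods[periods < split_period]       # short cycles -> small_scales
    return big, small

t = np.arange(960)
x = np.sin(2 * np.pi * t / 96) + 0.5 * np.sin(2 * np.pi * t / 24)
big, small = argtopk_cycles(x, c=3)
print("big_scales:", np.round(big, 1), "small_scales:", np.round(small, 1))
```

The trade-off reported for Figure 18 is visible in this framing: a larger c keeps weaker, noisier cycles, while c = 1 discards genuinely informative ones.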
The number of covariates was selected as follows. To choose appropriate and relevant covariates among the continuous and discrete variables, three sets of parameter analysis experiments were conducted on the PowerLoad dataset to measure the effect of different numbers of covariates on model performance. As shown in Figure 19, with the number of discrete covariates fixed, the MSE is minimized when the six most strongly correlated continuous variables are selected; with the number of continuous covariates fixed, the MSE is minimized when the single most relevant discrete variable is selected. The combination of six continuous and one discrete variable yields the lowest MSE, outperforming the two extreme combinations continuous_0_discrete_0 and continuous_8_discrete_2. These three experiments verify that the covariate screening preprocessing step not only improves the computational efficiency of the model, but also improves its accuracy to a certain extent and reduces the negative impact of some covariates on the model.
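One plausible form of this screening step is ranking continuous covariates by absolute correlation with the load and keeping the strongest six, as sketched below. The column names and the synthetic data are hypothetical; the paper does not publish its exact screening code.

```python
# Hedged sketch of covariate screening: rank continuous covariates by
# absolute Pearson correlation with the load and keep the top six.
# All column names and data here are hypothetical placeholders.
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
n = 1000
df = pd.DataFrame({f"cont_{i}": rng.standard_normal(n) for i in range(8)})
df["holiday"] = rng.integers(0, 2, n)             # hypothetical discrete covariate
df["load"] = 0.8 * df["cont_0"] + 0.5 * df["cont_1"] + 0.2 * rng.standard_normal(n)

continuous = [c for c in df.columns if c.startswith("cont_")]
corr = df[continuous].corrwith(df["load"]).abs().sort_values(ascending=False)
selected = corr.head(6).index.tolist()            # the 6 strongest continuous covariates
print("selected continuous covariates:", selected)
```

An analogous ranking over the discrete variables (e.g., by correlation ratio) would then supply the single most relevant discrete covariate.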
In this paper, a sequence decomposition model based on power spectral density and the Morlet continuous wavelet transform was proposed for ultra-short-term time series prediction of power load. From a time-frequency perspective, the model mines the feature relationships of the same sequence at different scales and introduces a time-frequency energy selection module, a "prior knowledge" guidance module, a sequence denoising decomposition module, and a probabilistic load prediction output to deeply mine and express the sequence, addressing problems of current ultra-short-term time series prediction such as poor accuracy and difficulty capturing local mutation features. Comparative experiments on four datasets validated the superiority of the model, and the validity of the three modules was further verified through ablation experiments, module analysis visualizations, and parameter analysis. The main limitation of the model is the manual selection of the hyperparameter scale factor c: too high a value increases computational complexity and thus reduces prediction efficiency, while too low a value weakens the ability of the energy selection modules to capture sequence features and thus reduces predictive ability. Future research will focus on automating the selection of the model's hyperparameters while maintaining its accuracy and efficiency (e.g., using Automated Machine Learning (AutoML)-based and algorithmic hyperparameter optimization methods) to improve the utility and generality of the model.
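The automated search proposed as future work could, for instance, take the following form with the Optuna library. Only the search loop is real Optuna API; train_and_validate is a hypothetical stand-in for one full training run of TFDNet, and its return values are toy numbers.

```python
# Hedged sketch of automated selection of the scale factor c with Optuna.
# train_and_validate() is hypothetical; a real run would train TFDNet with
# the trial's c and return the validation MSE.
import optuna

def train_and_validate(c: int) -> float:
    """Hypothetical stand-in for training TFDNet with scale factor c."""
    return {1: 0.210, 2: 0.185, 3: 0.178, 4: 0.181}[c]   # toy MSE values

def objective(trial: optuna.Trial) -> float:
    c = trial.suggest_int("c", 1, 4)                     # search the range used above
    return train_and_validate(c)

study = optuna.create_study(direction="minimize")
study.optimize(objective, n_trials=20)
print("best c:", study.best_params["c"], "best MSE:", study.best_value)
```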
The authors declare they have not used Artificial Intelligence (AI) tools in the creation of this article.
The authors gratefully acknowledge the support of the following foundations: the National Natural Science Foundation of China (No. U21A20469) and the Central Funds to Guide Local Science and Technology Development (No. YDZJSX2022C004).
The authors also gratefully acknowledge the data support provided by Shanxi Heli Innovation Technology Holding Company Ltd., China.
The authors declare no potential conflicts of interest with respect to the research, authorship, and publication of this article.