
Volatility forecasting is essential in asset pricing, portfolio allocation and risk management research. Early volatility forecasting was based on econometric models. The most famous are the ARCH model [1] and the GARCH model [2], which capture volatility clustering and heavy tails. However, they fail to capture asymmetries such as the leverage effect, which arises because negative returns have a greater impact on future volatility than positive returns. To overcome this drawback, the exponential GARCH (EGARCH) model [3] and the GJR model [4] were proposed. In the following years, new volatility models based on the GARCH framework emerged, such as the stochastic volatility model [5] proposed by Hull and White and the realized volatility model [6] of Blair et al. Together they form a class of GARCH-type volatility models for financial markets.
The traditional GARCH model imposes strict constraints and requires the financial time series to be stationary. It typically assumes that the conditional variance is a linear function of past squared errors and past variances. However, many financial time series exhibit nonstationary and nonlinear characteristics in practice. Consequently, extensions of the GARCH model are needed to study the volatility of such series.
With the development of computing and big-data technologies, machine learning has brought new ideas to volatility forecasting [7,8,9,10,11,12,13,14,15,16,17,18,19,20,21]. In particular, artificial neural networks (ANNs) have shown outstanding performance. ANNs derive their computational ideas from biological neurons and are now widely used in many fields.
In financial risk analysis, researchers have utilized neural networks to study the volatility of financial markets. Hamid and Iqbal [22] apply an ANN to predict the implied volatility of the S&P 500 index, finding that the ANN's forecasts surpass those of the American option pricing model. Livieris [23] proposes an artificial neural network model for forecasting gold prices and trends. Additionally, Dunis and Huang [24] explore neural network regression (NNR), recurrent neural networks (RNNs) and their combined NNR-RNN models for predicting and trading the volatility of daily GBP/USD and USD/JPY exchange rates, with results indicating that RNNs deliver the best volatility forecasts. Beyond the direct application of neural networks, researchers have investigated a series of mixture models [25,26,27,28,29,30,31] that combine ANNs and GARCH models. Liu et al. [32] introduce a volatility forecasting model based on the recurrent neural network (RNN) and the GARCH model. Experiments reveal that such mixture models enhance the predictive capabilities of traditional GARCH models, capturing the normality, skewness and kurtosis of financial volatility more accurately.
This study employs a mixture model (DeepAR-GMM-GARCH) that combines a deep autoregressive network, a Gaussian mixture model and the GARCH model for probabilistic volatility forecasting. First, we discuss the design of the mixture model. Second, we present the model's inference and training algorithm. Third, we conduct a simulation experiment on artificial data and compare the outcomes with traditional GARCH models, finding that our model yields smaller RMSE and MAE. Finally, we investigate the correlation between squared extreme values and squared returns for the CSI 300 index; the empirical data are partitioned into training and test sets, and after training and testing we analyze the prediction results, observing that our proposed model outperforms the other models both in sample and out of sample.
The key contributions of this article can be summarized as follows. First, we introduce a novel conditional volatility probability prediction model, built upon a deep autoregressive network combined with a Gaussian mixture distribution, which accommodates the leptokurtic and heavy-tailed traits of financial volatility. Second, we incorporate extreme values into the mixture model via the neural network, and we find that their inclusion improves the accuracy of volatility predictions.
The structure of this paper is as follows: Section 2 outlines the GARCH model and the deep autoregressive network. Section 3 develops the mixture model, elaborating on inference, prediction and the training algorithm. Section 4 presents simulation studies. Finally, Section 5 is devoted to the empirical analysis of our proposed model.
Scholars generally agree that stock price and stock index returns are nonlinear, asymmetric and heavy-tailed, and that returns are largely uncorrelated while volatility clusters. Volatility clustering was first captured by Engle (1982) and Bollerslev (1986) in the ARCH and GARCH models. The GARCH model is defined as follows:
$$r_t=\varepsilon_t\sqrt{h_t},\qquad h_t=\alpha_0+\sum_{i=1}^{q}\alpha_i r^2_{t-i}+\sum_{j=1}^{p}\beta_j h_{t-j}, \tag{2.1}$$
where $h_t$ is the conditional variance of the return series $r_t$ and $\varepsilon_t$ is an i.i.d. innovation.
Although there are many criteria for selecting the orders $p$ and $q$ of a GARCH($p$, $q$) model, the GARCH(1, 1) specification is usually sufficient to characterize conditional volatility.
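To make the recursion in Eq (2.1) concrete, the following is a minimal sketch that simulates a GARCH(1, 1) path; the parameter values are illustrative, not estimates from this paper.

```python
import numpy as np

def simulate_garch11(T, alpha0=0.02, alpha1=0.10, beta1=0.85, seed=0):
    """Simulate Eq (2.1) with p = q = 1:
    r_t = eps_t * sqrt(h_t),  h_t = alpha0 + alpha1*r_{t-1}^2 + beta1*h_{t-1}."""
    rng = np.random.default_rng(seed)
    r, h = np.empty(T), np.empty(T)
    h[0] = alpha0 / (1.0 - alpha1 - beta1)   # start at the unconditional variance
    r[0] = rng.standard_normal() * np.sqrt(h[0])
    for t in range(1, T):
        h[t] = alpha0 + alpha1 * r[t - 1] ** 2 + beta1 * h[t - 1]
        r[t] = rng.standard_normal() * np.sqrt(h[t])
    return r, h

returns, cond_var = simulate_garch11(1000)
```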
The DeepAR model [33], illustrated in Figure 1, is a time series forecasting model that employs a deep autoregressive recurrent network architecture. Distinct from other time series forecasting models, DeepAR generates probabilistic predictions.
Consider a time series $[x_1,\ldots,x_{t_0},x_{t_0+1},\ldots,x_T]:=x_{1:T}$. Given its past $[x_1,\ldots,x_{t_0-2},x_{t_0-1}]:=x_{1:t_0-1}$, our objective is to predict the future $[x_{t_0},x_{t_0+1},\ldots,x_T]:=x_{t_0:T}$. The DeepAR model constructs the conditional distribution $P_\Theta(x_{t_0:T}\mid x_{1:t_0-1})$ through a latent factor $z_t$ implemented by a deep recurrent network. This conditional distribution comprises a product of likelihood factors:
$$P_\Theta(x_{t_0:T}\mid x_{1:t_0-1})=\prod_{t=t_0}^{T}P_\Theta(x_t\mid x_{1:t-1})=\prod_{t=t_0}^{T}p(x_t\mid\theta(z_t,\Theta)). \tag{2.2}$$
The likelihood $p(x_t\mid\theta(z_t))$ is a fixed distribution whose parameters are given by a function $\theta(z_t,\Theta)$ of the network output $z_t$. As suggested by the model's authors, a Gaussian likelihood is appropriate for real-valued data.
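As a hedged illustration of how such a likelihood head can be wired (hypothetical layer names, not the DeepAR authors' code), a Gaussian output layer in PyTorch might look like:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GaussianOutput(nn.Module):
    """Map the RNN state z_t to the parameters (mu_t, sigma_t) of the Gaussian
    likelihood p(x_t | theta(z_t, Theta)); softplus keeps the scale positive."""
    def __init__(self, hidden_size):
        super().__init__()
        self.mu_layer = nn.Linear(hidden_size, 1)
        self.sigma_layer = nn.Linear(hidden_size, 1)

    def forward(self, z):
        mu = self.mu_layer(z)
        sigma = F.softplus(self.sigma_layer(z))  # sigma_t > 0
        return mu, sigma

def gaussian_nll(x, mu, sigma):
    """Negative log-likelihood of x under N(mu, sigma^2), one factor of Eq (2.2)."""
    return 0.5 * torch.log(2 * torch.pi * sigma**2) + (x - mu)**2 / (2 * sigma**2)
```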
Forecasting the probability distribution of volatility in finance and economics is an important problem, and there are two main approaches. The first uses statistical models such as the ARCH and GARCH models, which are specifically designed to capture the dynamics of volatility over time and to predict future volatility from past patterns.
The second uses machine learning models, such as neural networks, which can analyze vast amounts of data and uncover patterns that may not be readily apparent to human analysts. A case in point is the DeepAR model, a series-to-series probabilistic forecasting model. Its advantages are that it makes probabilistic forecasts and allows additional covariates to be introduced. Owing to these advantages, it can be used to predict financial volatility ($h_t$) from the series $r^2_t$. However, the DeepAR model usually assumes that $p(x_t\mid\theta(z_t))$ (given in (2.2)) follows a Gaussian distribution, which may be unreasonable given the non-negative, leptokurtic and heavy-tailed characteristics of financial volatility. To avoid this problem, a Gaussian mixture distribution can be used to describe the density of $p(\ln(x_t)\mid\theta(z_t))$; see [34]. Motivated by these results, this paper proposes an improved mixture model: DeepAR-GMM-GARCH.
The conditional distribution of $\ln(h_t)$ can be expressed as
$$P(\ln(h_t)\mid r^2_{1:t-1},\,x_{1:t-1}), \tag{3.1}$$
where $h_t$ represents the future volatility at time $t$, $[r_1,\ldots,r_{t-2},r_{t-1}]:=r_{1:t-1}$ denotes the past return series over the period $[1:t-1]$, and $x_{1:t-1}$ refers to covariates that are observable at all times.
The proposed hybrid model assumes that the conditional density of the logarithm of volatility is $p(\ln(h_t)\mid r^2_{1:t-1},x_{1:t-1})$, which involves a set of latent factors $z_t$. A recurrent neural network with parameters $\Theta_1$, specifically an LSTM, encodes the squared returns $r^2_t$, the input features $x_t$ and the previous latent factors $z_{t-1}$, producing the updated latent factors $z_t$. The likelihood $p(\ln(h_t)\mid\theta(z_t))$ follows a Gaussian mixture distribution with parameters determined by a function $\theta(z_t,\Theta_2)$ of the network output $z_t$. The network architecture of the DeepAR-GMM-GARCH model is depicted in Figure 2.
Because of the complex interplay between volatility and its drivers, the central component of our model posits that the volatility $h_t$ at time $t$ is derived from the latent variable $z_{t-1}$ at time $t-1$, the squared return $r^2_{t-1}$ and the covariates $x_{t-1}$, and that $p(\ln(h_t)\mid\theta(z_t))$ follows a Gaussian mixture distribution with $K$ components. In the empirical analysis, $x_{t-1}$ is a vector of extreme values. A nonlinear mapping function $g$ establishes this relationship. The DeepAR-GMM-GARCH model proposed in this paper is as follows:
$$\begin{aligned}
z_t &= g(z_{t-1},\,r^2_{t-1},\,x_{t-1},\,\Theta_1),\\
\mu_{k,t} &= \log(1+\exp(w^{T}_{k,\mu}z_t+b_{k,\mu})),\\
\sigma_{k,t} &= \log(1+\exp(w^{T}_{k,\sigma}z_t+b_{k,\sigma})),\\
\pi_{k,t} &= \log(1+\exp(w^{T}_{k,\pi}z_t+b_{k,\pi})),\\
P(\ln(h_t)\mid z_t,\Theta_2) &\sim \sum_{k=1}^{K}\pi_{k,t}\,N(\mu_{k,t},\sigma_{k,t}),\\
r_t &= \varepsilon_t\sqrt{h_t},\qquad \sum_{k=1}^{K}\pi_{k,t}=1.
\end{aligned} \tag{3.2}$$
The model can be viewed as a framework for nonlinear volatility prediction, since the distribution of the perturbation $\varepsilon_t$ can be chosen as $N(0,1)$ or $T(0,1,v)$. This gives rise to two distinct models, referred to as DeepAR-GMM-GARCH and DeepAR-GMM-GARCH-t.
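A hedged PyTorch sketch of the architecture in Eq (3.2) follows. The softplus transform matches the $\log(1+\exp(\cdot))$ mappings in the equations; renormalizing the mixing weights so that they sum to one is our assumption, since Eq (3.2) states the constraint but not how it is enforced. The three LSTM layers with 24 hidden nodes mirror the setup used later in Section 4.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GMMGarchHead(nn.Module):
    """Output layer of Eq (3.2): map the LSTM state z_t to the K mixture
    parameters via softplus = log(1+exp(.)); weights are renormalized to
    satisfy sum_k pi_{k,t} = 1 (our assumption)."""
    def __init__(self, hidden_size, K):
        super().__init__()
        self.mu = nn.Linear(hidden_size, K)
        self.sigma = nn.Linear(hidden_size, K)
        self.pi = nn.Linear(hidden_size, K)

    def forward(self, z):
        mu = F.softplus(self.mu(z))
        sigma = F.softplus(self.sigma(z))
        pi = F.softplus(self.pi(z))
        pi = pi / pi.sum(dim=-1, keepdim=True)   # enforce sum_k pi_{k,t} = 1
        return pi, mu, sigma

class DeepARGMMGarch(nn.Module):
    """z_t = g(z_{t-1}, r^2_{t-1}, x_{t-1}; Theta_1) via an LSTM, then the head."""
    def __init__(self, n_covariates, hidden_size=24, K=3):
        super().__init__()
        self.rnn = nn.LSTM(1 + n_covariates, hidden_size,
                           num_layers=3, batch_first=True)
        self.head = GMMGarchHead(hidden_size, K)

    def forward(self, r2, x):
        inp = torch.cat([r2.unsqueeze(-1), x], dim=-1)  # (batch, time, 1+n_cov)
        z, _ = self.rnn(inp)
        return self.head(z)
```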
If the distribution $p(\ln(h_t)\mid\theta(z_t))$ is assumed to be Gaussian, model (3.2) reduces to a simpler version:
$$\begin{aligned}
z_t &= g(z_{t-1},\,r^2_{t-1},\,x_{t-1},\,\Theta_1),\\
\mu_t &= \log(1+\exp(w^{T}_{\mu}z_t+b_{\mu})),\\
\sigma_t &= \log(1+\exp(w^{T}_{\sigma}z_t+b_{\sigma})),\\
P(\ln(h_t)\mid z_t,\Theta_2) &\sim N(\mu_t,\sigma^2_t),\\
r_t &= \varepsilon_t\sqrt{h_t}.
\end{aligned} \tag{3.3}$$
For simplicity, we call this the DeepAR-GARCH model.
For a given time series, our goal is to estimate the parameters $\Theta_1$ of the LSTM cells and the parameters $\Theta_2$ of the function $\theta$, which applies an affine transformation followed by a softplus activation. We employ quasi-maximum likelihood estimation with the objective $\Theta=\arg\max_{\Theta}\sum_i\log p(\tilde h_i\mid\Theta_1,\Theta_2)$. Inference from this likelihood requires accounting for the latent variable $z_t$.
The flowchart of the training algorithm is as follows. First, we use the BIC criterion to identify the number of clusters, $K$, for all samples. Each data point is assigned a label from 1 to $K$, and each cluster $k$ has its own mean vector and covariance matrix. Based on this clustering, we set the initial $\pi_k$ to the proportion of data points labelled $k$, and the initial mean vector $\mu_k$ and covariance matrix $\Sigma_k$ to the mean vector and covariance matrix of cluster $k$. The resulting parameter values $(\tilde\pi_{k,0},\tilde\mu_{k,0},\tilde\sigma^2_{k,0})$ are used to pre-train the DeepAR-GMM-GARCH model, which allows it to converge quickly. Next, we partition the training data into multiple batches, select a sample from a batch and feed $(r^2_{t_0-m},\ldots,r^2_{t_0-1})$ into the DeepAR-GMM-GARCH model. The model produces a set of $\tilde\pi_{k,t},\tilde\mu_{k,t},\tilde\sigma^2_{k,t}$, after which we sample from this Gaussian mixture, compute the loss and update the parameters by gradient descent. Since the sampling step cannot be differentiated directly, we apply the reparameterization trick. This process continues until the end of the training cycle. Last, we feed the training samples into the trained model for prediction evaluation: the model sequentially computes the parameters of the latent variable and the Gaussian mixture, samples from it, and returns the prediction results derived from the sampled values.
The training algorithm is shown in Algorithm 1.
Algorithm 1 Training procedure for the DeepAR-GMM-GARCH mixture model
1:  for each batch do
2:    for each t ∈ [t0−m, t0−1] do
3:      if t = t0−m then
4:        z_{t−1} = 0
5:      else
6:        z_t = g(z_{t−1}, r²_{t−1}, x_{t−1}, Θ1)
7:      end if
8:      for each k ∈ [1, K] do
9:        μ̃_{k,t} = log(1 + exp(wᵀ_{k,μ} z_t + b_{k,μ}))
10:       σ̃²_{k,t} = log(1 + exp(wᵀ_{k,σ} z_t + b_{k,σ}))
11:       π̃_{k,t} = log(1 + exp(wᵀ_{k,π} z_t + b_{k,π}))
12:     end for
13:     sample ln(h̃_t) ∼ GMM(π̃_{k,t}, μ̃_{k,t}, σ̃²_{k,t})
14:   end for
15:   compute Loss and adjust the model parameters Θ1, Θ2 by gradient descent
16: end for
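The following is a minimal PyTorch sketch of one training step (lines 13–15 of Algorithm 1), assuming the model sketched above and a loss function from Section 3.2.2. Sampling the mixture component is not differentiable, so here gradients flow only through the selected $\mu_k$ and $\sigma_k$ via the Gaussian reparameterization $\ln(h)=\mu_k+\sigma_k\varepsilon$; this is one simple reading of the reparameterization trick mentioned above, not necessarily the authors' exact implementation.

```python
import torch

def sample_ln_h(pi, mu, sigma):
    """Draw ln(h_t) from the Gaussian mixture: pick component k ~ Categorical(pi)
    (non-differentiable), then reparameterize: ln_h = mu_k + sigma_k * eps."""
    k = torch.distributions.Categorical(pi).sample()
    mu_k = mu.gather(-1, k.unsqueeze(-1)).squeeze(-1)
    sigma_k = sigma.gather(-1, k.unsqueeze(-1)).squeeze(-1)
    eps = torch.randn_like(mu_k)
    return mu_k + sigma_k * eps

def train_step(model, optimizer, r2_batch, x_batch, loss_fn):
    """One batch update in the spirit of Algorithm 1: the network reads
    (r^2_{t-1}, x_{t-1}) and its output parameterizes ln(h_t), which is
    scored against the next squared return r^2_t."""
    pi, mu, sigma = model(r2_batch[:, :-1], x_batch[:, :-1])
    ln_h = sample_ln_h(pi, mu, sigma)
    loss = loss_fn(torch.exp(ln_h), r2_batch[:, 1:])   # e.g., Eq (3.4) or (3.5)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```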
During training, the choice of loss function determines the prediction quality of the model. We use the average negative log-likelihood as the loss function, $Loss=L_h$. In the GARCH model, $\varepsilon_t$ is usually assumed to follow a Gaussian or a Student-t distribution; the two corresponding loss functions are as follows:
(1) When $\varepsilon_t\sim N(0,1)$, the loss function is:
$$L_h=-\frac{1}{N}\sum_{t=1}^{N}\left[\log(\tilde h_t)-\frac{r^2_t}{2\tilde h_t}\right]. \tag{3.4}$$
(2) When $\varepsilon_t\sim t(0,1,v)$, the loss function is:
$$L_h=-\frac{1}{N}\sum_{t=1}^{N}\left[\log(\tilde h_t)+\frac{1}{2}(v+1)\log\left(1+\frac{r^2_t}{\tilde h_t(v-2)}\right)\right]. \tag{3.5}$$
To compute these loss functions, we need samples of $\tilde h_t$, obtained by Algorithm 1 in Section 3.2.1. In practice, if $\varepsilon_t$ follows another distribution, the loss function in (3.4) can still be used in the spirit of QMLE (see Liu and So, 2020).
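A direct transcription of the two loss functions into PyTorch, as written in Eqs (3.4) and (3.5), could read (the default $v=6$ matches the degrees of freedom estimated later):

```python
import torch

def loss_gaussian(h_tilde, r2):
    """Eq (3.4): loss when eps_t ~ N(0,1)."""
    return -torch.mean(torch.log(h_tilde) - r2 / (2 * h_tilde))

def loss_student(h_tilde, r2, v=6.0):
    """Eq (3.5): loss when eps_t ~ t(0,1,v)."""
    return -torch.mean(torch.log(h_tilde)
                       + 0.5 * (v + 1) * torch.log(1 + r2 / (h_tilde * (v - 2))))
```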
Experiments are carried out on volatility inference using simulated time series. These series exhibit flexibility, with both volatility and mixing coefficients changing over time, as detailed below:
$$\begin{aligned}
r_t &= \varepsilon_t\sqrt{h_t},\qquad \varepsilon_t\sim N(0,1),\\
p(h_t\mid\mathcal F_{t-1}) &= \eta_{1,t}\,\phi(\mu_t,\sigma^2_{1,t})+\eta_{2,t}\,\phi(\mu_t,\sigma^2_{2,t}),\\
\mu_t &= a_0+a_1 h_{t-1},\\
\sigma^2_{1,t} &= \alpha_{01}+\alpha_{11}r^2_{t-1}+\beta_1\sigma^2_{1,t-1},\\
\sigma^2_{2,t} &= \alpha_{02}+\alpha_{12}r^2_{t-1}+\beta_2\sigma^2_{2,t-1},\\
\pi_{1,t} &= c_0+c_1 h_{t-1},\\
\eta_{1,t} &= \exp(\pi_{1,t})/(1+\exp(\pi_{1,t})),
\end{aligned} \tag{4.1}$$
where $\mathcal F_{t-1}$ denotes the information set through time $t-1$ and $\phi$ is the Gaussian density function. $\eta_{1,t}$ and $\eta_{2,t}$ are the mixing coefficients of the two Gaussian distributions and satisfy $\eta_{2,t}=1-\eta_{1,t}$. When generating the simulated data, we set $\alpha_{01}=0.01$, $\alpha_{11}=0.1$, $\beta_1=0.15$, $\alpha_{02}=0.04$, $\alpha_{12}=0.15$, $\beta_2=0.82$, $c_0=0.02$, $c_1=0.90$, $a_0=0.02$, $a_1=0.6$. The time series has initial values $r_0=0.1$, $\sigma^2_0=0$, $h_0=0$. Sample sizes of $T$ = 500, 1000 and 1500 are considered, with 1000 replications.
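For reproducibility, a sketch of the data-generating process (4.1) follows; the parameter and initial values are those stated above (we start both mixture variances at zero), and the clipping of $h_t$ away from zero is our addition, since a Gaussian draw for the variance can be negative.

```python
import numpy as np

def simulate_mixture_garch(T, seed=0):
    """Generate returns from the two-component DGP in Eq (4.1)."""
    a0, a1 = 0.02, 0.6
    al01, al11, b1 = 0.01, 0.10, 0.15
    al02, al12, b2 = 0.04, 0.15, 0.82
    c0, c1 = 0.02, 0.90
    rng = np.random.default_rng(seed)
    r, h = np.empty(T + 1), np.empty(T + 1)
    s1, s2 = np.empty(T + 1), np.empty(T + 1)
    r[0], s1[0], s2[0], h[0] = 0.1, 0.0, 0.0, 0.0     # initial values from the text
    for t in range(1, T + 1):
        mu = a0 + a1 * h[t - 1]
        s1[t] = al01 + al11 * r[t - 1] ** 2 + b1 * s1[t - 1]
        s2[t] = al02 + al12 * r[t - 1] ** 2 + b2 * s2[t - 1]
        eta1 = 1.0 / (1.0 + np.exp(-(c0 + c1 * h[t - 1])))   # logistic mixing weight
        s = s1[t] if rng.random() < eta1 else s2[t]          # pick mixture component
        h[t] = mu + np.sqrt(s) * rng.standard_normal()       # draw h_t from the mixture
        h[t] = max(h[t], 1e-8)                               # keep variance positive (assumption)
        r[t] = np.sqrt(h[t]) * rng.standard_normal()
    return r[1:], h[1:]
```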
For the series simulated from (4.1), we apply three models to forecast volatility: the GARCH model with $\varepsilon_t\sim N(0,1)$ (GARCH-n), the GARCH model with $\varepsilon_t\sim T(0,1,v)$ (GARCH-t) and the DeepAR-GMM-GARCH model. Using the MCMC sampling method, the degrees of freedom of the Student-t distribution are estimated to be 6. For the DeepAR-GMM-GARCH model, we use a recurrent neural network with three LSTM layers and 24 hidden nodes. For the number of input nodes $m$ in Figure 2, we use a grid search to find the optimal value, and we choose $K$ by the BIC rule using the Mclust package [35]. The model's hyperparameters are tuned with Optuna, a widely used automatic hyperparameter optimization framework.
Table 1 reports the in-sample errors (RMSE and MAE) of the three volatility forecasting models. GARCH-n and GARCH-t show similar performance, and our DeepAR-GMM-GARCH model stands out; all models show decreasing RMSE as the sample size increases. These results suggest that the proposed estimator is asymptotically convergent. Table 2 reports the average out-of-sample errors of the three models. As in the in-sample results, our DeepAR-GMM-GARCH model is superior to the GARCH-n and GARCH-t models.
sample size | Model | RMSE | MAE
T=500 | GARCH-n | 0.1364 | 0.0628 |
GARCH-t | 0.1211 | 0.0591 | |
DeepAR-GMM-GARCH | 0.1068_ | 0.0548_ | |
T=1000 | GARCH-n | 0.0621 | 0.0437 |
GARCH-t | 0.0578 | 0.0419 | |
DeepAR-GMM-GARCH | 0.0398_ | 0.0331_ | |
T=1500 | GARCH-n | 0.0604 | 0.0428 |
GARCH-t | 0.0652 | 0.0401 | |
DeepAR-GMM-GARCH | 0.0300_ | 0.0325_ | |
Note: Number of replications = 1000. |
sample size | Model | RMSE | MAE
T=500 | GARCH-n | 0.3564 | 0.3028 |
GARCH-t | 0.3271 | 0.3091 | |
DeepAR-GMM-GARCH | 0.2761_ | 0.2311_ | |
T=1000 | GARCH-n | 0.2619 | 0.2117 |
GARCH-t | 0.2318 | 0.2033 | |
DeepAR-GMM-GARCH | 0.2091_ | 0.1834_ | |
T=1500 | GARCH-n | 0.2241 | 0.2179 |
GARCH-t | 0.2213 | 0.1971 | |
DeepAR-GMM-GARCH | 0.1911_ | 0.1722_ | |
Note: Number of replications = 1000. |
A composite stock index reflects the average economic performance of the whole financial market. In this section, we study daily OHLC data of the China Shanghai-Shenzhen 300 index (CSI 300). The OHLC data contain the daily high, low, open and close prices. Scholars have pointed out that combining the open, high, low and close prices yields more efficient volatility estimates. Hence, we also introduce OHLC data into our mixture model.
The CSI 300 data studied in this paper run from January 4, 2010, to December 30, 2021, a total of 2916 trading days. Let $r_t$ be the returns of the corresponding series, calculated from the closing-price series $C_t$ of the CSI 300 index:
$$r_t=100\log\frac{C_{t+1}}{C_t}. \tag{5.1}$$
The time series of $r_t$ and $r^2_t$ are plotted in Figure 3. The series $r^2_t$ displays pronounced volatility clustering, and the amplitude of volatility gradually decreases.
The collected data are divided into training and test sets. Table 3 reports their descriptive statistics. The mean of $r_t$ is small, only 0.007465, while the standard deviation is 2.251006, indicating large variation. The skewness is below 0 and the kurtosis exceeds 3, so the series is left-skewed and heavy-tailed. The test data share these characteristics: dispersion, large volatility, left skewness and a sharper peak than the normal distribution. This suggests that the normal distribution may not suit our data, and heavy-tailed alternatives, such as the t distribution or a mixture of normals, could be more appropriate.
Data Set | Period | Mean | Std. | Skew. | Kurt. |
training data | 04/01/2010 to 29/12/2017 | 0.007465 | 2.251006 | −0.755580 | 5.136707 |
test data | 02/01/2018 to 30/12/2021 | 0.019125 | 1.717302 | −0.427819 | 3.248347 |
Besides the close price ($C_t$), we also use the high price ($H_t$), the open price ($O_t$) and the low price ($L_t$). Define $u_t=(H_t-O_t)^2$, $d_t=(L_t-O_t)^2$ and $c_t=(C_t-O_t)^2$.
The correlation matrix (5.2) below shows the correlation coefficients between $u_t$, $d_t$, $c_t$ and $r^2_t$. The coefficients for the pairs $(u_t,r^2_t)$, $(d_t,r^2_t)$ and $(c_t,r^2_t)$ are large. Intuitively, large values of $u_t$, $d_t$ and $c_t$ usually signal large volatility ($r^2_t$). However, classical volatility models do not take such extreme values into account. It is therefore reasonable to use a neural network together with $u_t$, $d_t$ and $c_t$ to forecast volatility, because the network can admit additional covariates and capture the complex relations between them.
$$\begin{array}{c|cccc}
 & r^2_t & u_t & d_t & c_t\\ \hline
r^2_t & 1.000 & 0.394 & 0.475 & 0.716\\
u_t & 0.394 & 1.000 & 0.012 & 0.556\\
d_t & 0.475 & 0.012 & 1.000 & 0.690\\
c_t & 0.716 & 0.556 & 0.690 & 1.000
\end{array} \tag{5.2}$$
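A sketch of how these features and their correlations can be computed from daily OHLC data follows; the column names are placeholders, and the return follows Eq (5.1) up to the indexing convention.

```python
import numpy as np
import pandas as pd

def ohlc_correlations(df):
    """Build u_t, d_t, c_t and r_t^2 from daily OHLC columns and return their
    correlation matrix, as in (5.2)."""
    feats = pd.DataFrame(index=df.index)
    feats["r2"] = (100 * np.log(df["close"]).diff()) ** 2   # squared log return
    feats["u"] = (df["high"] - df["open"]) ** 2
    feats["d"] = (df["low"] - df["open"]) ** 2
    feats["c"] = (df["close"] - df["open"]) ** 2
    return feats.corr()
```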
This paper uses four evaluation indicators to measure the predictive performance of the models: NMAE, HR, the linear correlation coefficient and the rank correlation coefficient, defined as follows:
$$\mathrm{NMAE}=\frac{\sum_{t=1}^{N}\left|r^2_{t+1}-\tilde h_{t+1}\right|}{\sum_{t=1}^{N}\left|r^2_{t+1}-r^2_t\right|}, \tag{5.3}$$
$$\mathrm{HR}=\frac{1}{N}\sum_{t=1}^{N}\theta_t,\qquad
\theta_t=\begin{cases}1, & (\tilde h_{t+1}-r^2_t)(r^2_{t+1}-r^2_t)\ge 0,\\ 0, & \text{else},\end{cases} \tag{5.4}$$
where $N$ represents the number of predicted samples. Both NMAE and HR take values between 0 and 1; the smaller these two indicators, the better the model's performance.
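Under the definitions just given, the two indicators can be computed as follows (a sketch; the array conventions are our assumption):

```python
import numpy as np

def nmae(r2, h_pred):
    """Eq (5.3). r2 has length N+1 (r^2_1 .. r^2_{N+1}); h_pred has length N
    and holds the forecasts h~_{t+1} for t = 1..N."""
    return np.sum(np.abs(r2[1:] - h_pred)) / np.sum(np.abs(np.diff(r2)))

def hit_rate(r2, h_pred):
    """Eq (5.4): fraction of steps where the forecast moves in the same
    direction as the realized squared return."""
    theta = (h_pred - r2[:-1]) * (r2[1:] - r2[:-1]) >= 0
    return theta.mean()
```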
Scholars usually use high-frequency volatility estimates as a proxy for actual volatility when evaluating forecasting models. We likewise use the realized volatility ($\sigma^2_{RV,t}$) as a proxy, computed by summing squared 5-minute intra-day returns:
$$\sigma^2_{RV,t}=\sum_{i=1}^{48}\left[\log r_{t,i}-\log r_{t,i-1}\right]^2. \tag{5.5}$$
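A one-day sketch of this estimator follows. We read the log terms in Eq (5.5) as log prices, which is the standard realized-variance construction; 49 five-minute price points yield 48 intra-day returns.

```python
import numpy as np

def realized_variance(intraday_prices):
    """Eq (5.5): sum of squared 5-minute log returns over one trading day."""
    log_p = np.log(np.asarray(intraday_prices))
    return np.sum(np.diff(log_p) ** 2)
```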
We focus on the out-of-sample predictive performance of the models, so the correlation between the realized volatilities $\sigma^2_{RV,t+1}$ and the predicted volatilities $\tilde h_{t+1}$ is measured only on the test set. We calculate Pearson's coefficient
$$r=\frac{\sum_{t=1}^{N}\left(\sigma^2_{RV,t+1}-\overline{\sigma^2_{RV}}\right)\left(\tilde h_{t+1}-\overline{\tilde h}\right)}{\sqrt{\sum_{t=1}^{N}\left(\sigma^2_{RV,t+1}-\overline{\sigma^2_{RV}}\right)^2}\,\sqrt{\sum_{t=1}^{N}\left(\tilde h_{t+1}-\overline{\tilde h}\right)^2}}, \tag{5.6}$$
where $\overline{\sigma^2_{RV}}$ and $\overline{\tilde h}$ denote the respective mean values. We also compute Spearman's rank correlation coefficient $r_s$, calculated with Eq (5.6) after replacing the volatilities by their ranks; it is considered more robust than Pearson's coefficient. Both $r$ and $r_s$ lie between $-1$ and 1, and a value of $r$ ($r_s$) near 0 means that the realized and predicted volatilities are uncorrelated.
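Both coefficients are available off the shelf; a sketch using SciPy, which computes the same quantities as Eq (5.6) and its rank version:

```python
from scipy import stats

def correlation_measures(rv, h_pred):
    """Pearson's r (Eq (5.6)) and Spearman's r_s between realized volatilities
    sigma^2_{RV,t+1} and predicted volatilities h~_{t+1} on the test set."""
    r, _ = stats.pearsonr(rv, h_pred)
    rs, _ = stats.spearmanr(rv, h_pred)
    return r, rs
```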
Simulation experiments demonstrate that our proposed model exhibits greater prediction accuracy than the GARCH model. In this section, to highlight the advantages of our model, we compare it to the classic GARCH model, ANN-GARCH (an existing neural network GARCH model) and the DeepAR-GARCH model using empirical data.
Among the four models, the GARCH and ANN-GARCH models predict conditional volatility, whereas the DeepAR-GARCH and DeepAR-GMM-GARCH models provide probabilistic forecasts for conditional volatility. To facilitate comparison, we will calculate the mean and quantiles of the probability density function for conditional volatility derived from the DeepAR-GARCH and DeepAR-GMM-GARCH models.
The estimated parameters of the GARCH models (GARCH-n and GARCH-t) are summarized in Tables 4 and 5. For the GARCH-t model, the degrees-of-freedom parameter $v$ is estimated at around 6. Both models are estimated with high values of $\beta_1$ and low values of $\alpha_1$, and the sum of $\alpha_1$ and $\beta_1$ is almost 1, which implies the sequence may be nearly non-stationary. Our model, which imposes no stationarity constraint, is therefore more suitable. The ANN-GARCH model employs a three-layer ANN structure, featuring two input nodes, 24 hidden nodes and a single output node. Likewise, the DeepAR-GARCH and DeepAR-GMM-GARCH models also have a three-layer design, with 14 input nodes, 24 hidden nodes, and output layers with two and five output nodes, respectively.
Data Set | α0 | α1 | β1
CSI300 | 1.8365e−04 | 0.1000 | 0.8800 |
Data Set | α0 | α1 | β1 | ν
CSI300 | 1.8103e−04 | 0.1000 | 0.8400 | 6.4625 |
Table 6 lists the in-sample performance of the five volatility prediction models. The DeepAR-GMM-GARCH model has the smallest HR and loss, and the DeepAR-GARCH model has the smallest NMAE. Overall, the DeepAR-GMM-GARCH model outperforms the traditional GARCH models and the DeepAR-GARCH model in volatility prediction.
Data Set | Model | Loss | NMAE | HR |
In-sample | GARCH-n | 1.401 | 0.763 | 0.704 |
GARCH-t | 1.331 | 0.761 | 0.637 | |
ANN-GARCH | 1.603 | 0.827 | 0.690 | |
DeepAR-GARCH | 1.541 | 0.748 | 0.717 | |
DeepAR-GMM-GARCH | 1.311 | 0.751 | 0.630 |
In Figure 4, we display a portion of the forecasting results of the various models and compare them with $r^2_t$. As shown in (a), the forecasts of the GARCH models differ from $r^2_t$: the GARCH models fail to capture large changes in $r^2_t$. From (b) and (c), it is evident that the neural network models capture the trend of $r^2_t$ and predict large fluctuations more accurately, with the DeepAR-GMM-GARCH model performing best. In (d), we observe that the estimated 90% quantiles from the DeepAR-GMM-GARCH model lie closer to the observations ($r^2_t$).
In summary, the estimates in Table 6 and the plots in Figure 4 show that introducing the extreme values ($u_t$, $d_t$, $c_t$) helps to improve the forecasting accuracy of the mixture volatility models. Hence the proposed approach is of particular practical value.
In the out-of-sample analysis, as discussed in Section 5.1, the test set comprises the time series of 972 trading days subsequent to the respective training set.
Table 7 presents the performance of the models under common error measures (loss, NMAE and HR). The DeepAR-GARCH model attains the lowest loss value on the test set. The neural network models display lower NMAE and HR values than the GARCH models, and the DeepAR-GMM-GARCH model exhibits the lowest NMAE and HR on the test data set.
Data Set | Model | Loss | NMAE | HR |
Out-sample | GARCH-n | 2.320 | 0.917 | 0.868 |
GARCH-t | 2.008 | 0.915 | 0.859 | |
ANN-GARCH | 2.517 | 0.903 | 0.783 | |
DeepAR-GARCH | 1.916 | 0.929 | 0.801 | |
DeepAR-GMM-GARCH | 2.100 | 0.790 | 0.722 |
In Figure 5, we plot part of the forecasting results of the five models on the out-of-sample data set and compare them with $r^2_t$. As in the in-sample results, (a) shows that the GARCH models fail to capture large changes in $r^2_t$. From (b) and (c), the neural network models capture most of the fluctuations of $r^2_t$ well, with the DeepAR-GMM-GARCH model performing best. From (d), the estimated 90% quantiles of the DeepAR-GMM-GARCH model appear to align more closely with the observations ($r^2_t$).
Section 5.2 introduced the linear correlation $r$ and the rank correlation $r_s$ as two measures for comparing predicted and realized volatilities. Their values on the test set are reported in Table 8. On average, the DeepAR-GMM-GARCH model performs best among all models, obtaining the highest rank correlation on the test set. Rank correlation is more robust than linear correlation since it detects dependence nonparametrically.
Out-sample | GARCH-n | GARCH-t | ANN-GARCH | DeepAR-GARCH | DeepAR-GMM-GARCH |
r | 0.381 | 0.420 | 0.490 | 0.473 | 0.504_ |
rs | 0.477 | 0.502 | 0.500 | 0.516 | 0.527_ |
This paper studies a mixture volatility forecasting model based on an autoregressive neural network and the GARCH model in order to obtain more precise forecasts of conditional volatility. The inference, loss functions and training algorithm of the mixture model are given. Simulation results show that our model attains smaller errors than the classic GARCH models. The empirical study of the CSI 300 index shows that, with extreme values included, our model significantly improves forecasting accuracy relative to the usual models.
Our research findings offer valuable insights into the prediction of volatility uncertainty. In future studies, our model can be employed for various high-frequency volatility analyses, where it is anticipated to exhibit enhanced performance.
This work is partially supported by Guangdong Basic and Applied Basic Research Foundation (2022A1515010046) and Funding by Science and Technology Projects in Guangzhou (SL2022A03J00654).
The authors declare no conflict of interest.