Research article Special Issues

Back propagation neural network model for medical expenses in patients with breast cancer

  • Received: 03 March 2021 Accepted: 09 April 2021 Published: 27 April 2021
  • Objective 

    Breast cancer seriously endangers women's life and health, and brings huge economic burden to the family and society. The aim of this study was to analyze the medical expenses and influencing factors of breast cancer patients, and provide theoretical basis for reasonable control of medical expenses of breast cancer patients.

    Methods 

    The medical expenses and related information of all female breast cancer patients diagnosed in our hospitals from 2017 to 2019 were collected. Using SPSS Clementine 12.0 software, a back propagation (BP) neural network model and a multiple linear regression model were constructed, and the influencing factors of medical expenses of breast cancer patients identified by the two models were compared.

    Results 

    In the study of medical expenses of breast cancer patients, the prediction error of BP neural network model is less than that of multiple linear regression model. At the same time, the results of the two models showed that the length of stay and region were the top two factors affecting the medical expenses of breast cancer patients.

    Conclusion 

    Compared with multiple linear regression model, BP neural network model is more suitable for the analysis of medical expenses in patients with breast cancer.

    Citation: Feiyan Ruan, Xiaotong Ding, Huiping Li, Yixuan Wang, Kemin Ye, Houming Kan. Back propagation neural network model for medical expenses in patients with breast cancer[J]. Mathematical Biosciences and Engineering, 2021, 18(4): 3690-3698. doi: 10.3934/mbe.2021185




    Estimating foreign exchange rate (FX) volatility is a core risk management activity for financial institutions, corporates and regulators. The subject has been extensively investigated by both practitioners and scientific researchers, and several alternative models exist. Among the most prominent are the models belonging to the GARCH and stochastic volatility classes. However, the true value of volatility cannot be directly observed. Hence, volatility must be estimated, inevitably with error. This constitutes a fundamental problem in implementing parametric models, especially in the context of high-frequency data. Andersen and Bollerslev (1998) proposed using realized volatility, as derived from high-frequency data, to accurately measure the true latent integrated volatility. This approach has gained attention for volatility modeling in markets where tick-level data is available (Andersen et al., 2013). Andersen et al. (2003) suggest fractionally integrated ARFIMA models in this context. Still, the long-memory HAR (heterogeneous autoregressive) model of Corsi (2009) is arguably the most widely used to capture the high persistence typically observed in realized volatility of financial prices. The HAR model is relatively simple and easy to estimate. In empirical applications, the model tends to perform better than GARCH and stochastic volatility models, possibly due to the sensitivity of tightly parameterized volatility models to minor model misspecifications (Sizova, 2011). Although realized volatility (RV) is a consistent estimator of the true latent volatility, it is subject to measurement error in empirical finite samples. Hence, RV will reflect not only the true latent integrated volatility (IV), but also additional measurement errors. Bollerslev et al. (2016) propose utilizing higher-order realized moments of the realized distribution to approximate these measurement errors. More specifically, Bollerslev et al. (2016) propose the HARQ model, which augments the HAR model with realized quarticity as an additional covariate.

    The empirical performance of the HARQ model and related extensions has been extensively studied. The focus has predominantly been on equity markets. A majority of the studies analyze U.S. data; see Bollerslev et al. (2016); Clements and Preve (2021); Pascalau and Poirier (2023); Andersen et al. (2023) and others. Liu et al. (2018) and Wang et al. (2020) investigate Chinese equity markets, whereas Liang et al. (2022) and Ma et al. (2019) analyse international data. Bitcoin and electricity markets have attracted some attention; see, for instance, Shen et al. (2020); Qieu et al. (2021), and Qu et al. (2018).

    Empirical applications of the HARQ model in the context of foreign exchange rate risk are sparse. Lyócsa et al. (2016) find that the standard HAR model rarely is outperformed by less parsimonious specifications on CZKEUR, PLZEUR, and HUFEUR data. Plíhal et al. (2021) and Rokicka and Kudła (2020) estimate the HARQ model on EURUSD and EURGBP data, respectively. Their focus is different from ours, as they investigate the incremental predictive power of implied volatility for a broad class of HAR models. In a similar vein, Götz (2023) and Lyócsa et al. (2024) utilize the HARQ model for the purpose of estimating foreign exchange rate tail risk.

    Using updated tick-level data from two major currency pairs, EURUSD and USDJPY, this paper documents the relevance of realized quarticity for improving volatility estimates across varying forecasting horizons. These results are robust across estimation windows, evaluation metrics, and model specifications.

    We use high-frequency intraday tick-level spot data, publicly available at DukasCopy.* The sample period is 1 January 2010 to 31 December 2022. Liu et al. (2015) investigate the optimal intraday sampling frequency across a significant number of asset classes and find that 5-minute intervals usually outperform others. Hence, as is common in the literature, we estimate the realized volatility from 5-minute returns.

    *This data source is also used by Plíhal et al. (2021), Risstad et al. (2023) and Lyócsa et al. (2024), among others.

    To filter tick-level data, we follow a two-step cleaning procedure based on the recommendations by Barndorff-Nielsen et al. (2009). Initially, we eliminate data entries that exhibit any of the following issues: (i) absence of quotes, (ii) a negative bid-ask spread, (iii) a bid-ask spread exceeding 50 times the median spread of the day, or (iv) a mid-quote deviation beyond ten mean absolute deviations from a centered mean (computed excluding the current observation from a window of 25 observations before and after). Following this, we calculate the mid-quotes as the average of the bid and ask quotes and then resample the data at 5-minute intervals.
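    The two-step cleaning procedure above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the column names `bid` and `ask`, the function name `clean_quotes`, and the reading of rule (iv) as a mean absolute deviation around the local window mean are all hypothetical choices made for the example.

```python
import numpy as np
import pandas as pd

def clean_quotes(quotes: pd.DataFrame, k: int = 25, n_mad: float = 10.0) -> pd.Series:
    """Return 5-minute mid-quotes from raw ticks indexed by timestamp.

    Assumes hypothetical columns 'bid' and 'ask' and a DatetimeIndex.
    """
    q = quotes.dropna(subset=["bid", "ask"])                # (i) missing quotes
    spread = q["ask"] - q["bid"]
    q, spread = q[spread >= 0], spread[spread >= 0]         # (ii) negative spread
    daily_median = spread.groupby(spread.index.normalize()).transform("median")
    q = q[spread <= 50 * daily_median]                      # (iii) extreme spread

    mid = ((q["bid"] + q["ask"]) / 2).to_numpy()
    keep = np.ones(len(mid), dtype=bool)
    for j in range(len(mid)):                               # (iv) local outliers
        lo, hi = max(0, j - k), min(len(mid), j + k + 1)
        window = np.delete(mid[lo:hi], j - lo)              # exclude current tick
        if window.size == 0:
            continue
        centre = window.mean()
        mad = np.abs(window - centre).mean()
        if mad > 0 and abs(mid[j] - centre) > n_mad * mad:
            keep[j] = False

    clean_mid = pd.Series(mid[keep], index=q.index[keep])
    # Last clean mid-quote in each 5-minute interval; empty intervals are dropped.
    return clean_mid.resample("5min").last().dropna()
```

    The explicit Python loop keeps rule (iv) easy to read; for a full tick history a vectorized rolling computation would likely be preferable.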

    We compute the consistent estimator of the true latent time-t variance from

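    $$RV_t^2 \equiv \sum_{i=1}^{M} r_{t,i}^2, \qquad (1)$$

    where $M = 1/\Delta$, and the $\Delta$-period intraday return is $r_{t,i} \equiv \log(S_{t-1+i\times\Delta}) - \log(S_{t-1+(i-1)\times\Delta})$, where $S$ is the spot exchange rate. Analogously, the multi($h$)-period realized variance estimator is

    $$RV_{t-1|t-h}^2 = \frac{1}{h}\sum_{i=1}^{h} RV_{t-i}^2. \qquad (2)$$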

    Setting h=5 and h=22 yields weekly and monthly estimates, respectively.
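    As a minimal sketch of (1) and (2), assuming one day of 5-minute mid-quotes is held in a NumPy array, the daily estimator and its multi-period averages could be computed as below. The function names are illustrative only.

```python
import numpy as np

def realized_variance(mid_5min: np.ndarray) -> float:
    """Sum of squared intraday log returns for one day, as in (1)."""
    r = np.diff(np.log(mid_5min))
    return float(np.sum(r ** 2))

def multi_period_rv(rv: np.ndarray, h: int) -> np.ndarray:
    """Average of the h most recent daily values, as in (2).

    Entry t holds mean(rv[t-h], ..., rv[t-1]); the first h entries are NaN
    because not enough history is available.
    """
    rv = np.asarray(rv, dtype=float)
    out = np.full(len(rv), np.nan)
    csum = np.concatenate(([0.0], np.cumsum(rv)))
    out[h:] = (csum[h:-1] - csum[:-h - 1]) / h
    return out
```

    Calling `multi_period_rv(rv, 5)` and `multi_period_rv(rv, 22)` would give the weekly and monthly regressors used in the HAR-type models.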

    Table 1 displays descriptive statistics for daily realized variances, as computed from (1).

    Table 1.  Realized Variance (daily).

    |        | Min    | Mean   | Median | Max      | ρ1     |
    |--------|--------|--------|--------|----------|--------|
    | EURUSD | 0.1746 | 3.0606 | 2.2832 | 59.4513  | 0.5529 |
    | USDJPY | 0.1018 | 3.2460 | 2.0096 | 168.0264 | 0.2860 |

    The table contains summary statistics for the daily RVs for EURUSD and USDJPY. ρ1 is the standard first-order autocorrelation coefficient. Sample period: 1 January 2010 to 31 December 2022.


    To represent the long-memory dynamic dependencies in volatility, Corsi (2009) proposed using daily, weekly, and monthly lags of realized volatility as covariates. The original HAR model is defined as

    $$RV_t = \beta_0 + \beta_1 RV_{t-1} + \beta_2 RV_{t-1|t-5} + \beta_3 RV_{t-1|t-22} + u_t, \qquad (3)$$

    where RV is computed from (1) and (2). If the variables in (3) contain measurement errors, the beta coefficients will be affected. Bollerslev et al. (2016) suggest two measures to alleviate this. First, they include a proxy for measurement error as an additional explanatory variable. Furthermore, they directly adjust the coefficients in proportion to the magnitude of the measurement errors:

    $$RV_t = \beta_0 + \underbrace{(\beta_1 + \beta_{1Q} RQ_{t-1}^{1/2})}_{\beta_{1,t}} RV_{t-1} + \underbrace{(\beta_2 + \beta_{2Q} RQ_{t-1|t-5}^{1/2})}_{\beta_{2,t}} RV_{t-1|t-5} + \underbrace{(\beta_3 + \beta_{3Q} RQ_{t-1|t-22}^{1/2})}_{\beta_{3,t}} RV_{t-1|t-22} + u_t,$$

    where realized quarticity RQ is defined as

    $$RQ_t \equiv \frac{M}{3}\sum_{i=1}^{M} r_{t,i}^4. \qquad (4)$$
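    A short sketch of the realized quarticity in (4), assuming the intraday returns of one day are available as a NumPy array (the function name is illustrative):

```python
import numpy as np

def realized_quarticity(returns: np.ndarray) -> float:
    """Realized quarticity for one day: (M / 3) times the sum of r^4."""
    returns = np.asarray(returns, dtype=float)
    m = len(returns)
    return float(m / 3.0 * np.sum(returns ** 4))
```

    As a sanity check on the scaling, if all M intraday returns were equal, the sum of squared returns would be M r^2 and the quarticity would equal one third of its square.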

    The full HARQ model above, referred to as HARQ-F, adjusts the coefficients on all lags of RV. A reasonable conjecture is that measurement errors in realized volatilities tend to diminish at longer forecast horizons, as these errors are diversified over time. This suggests that measurement errors in daily lagged realized volatilities are likely to be relatively more important. Motivated by this, Bollerslev et al. (2016) specify the HARQ model as

    $$RV_t = \beta_0 + \underbrace{(\beta_1 + \beta_{1Q} RQ_{t-1}^{1/2})}_{\beta_{1,t}} RV_{t-1} + \beta_2 RV_{t-1|t-5} + \beta_3 RV_{t-1|t-22} + u_t. \qquad (5)$$
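    Because (5) is linear in its parameters once the interaction term is formed, it can be estimated by ordinary least squares. The sketch below assumes daily series `rv` and `rq` of equal length and uses plain `numpy.linalg.lstsq`; it is an illustration of the regression design, not the estimation code behind the reported tables, and it omits the robust standard errors.

```python
import numpy as np

def fit_harq(rv, rq):
    """OLS estimates (beta0, beta1, beta2, beta3, beta1Q) for the HARQ model (5)."""
    rv = np.asarray(rv, dtype=float)
    rq = np.asarray(rq, dtype=float)
    t = np.arange(22, len(rv))                           # days with a full monthly lag
    daily = rv[t - 1]
    weekly = np.array([rv[s - 5:s].mean() for s in t])
    monthly = np.array([rv[s - 22:s].mean() for s in t])
    interaction = np.sqrt(rq[t - 1]) * daily             # RQ^{1/2}_{t-1} * RV_{t-1}
    X = np.column_stack([np.ones(len(t)), daily, weekly, monthly, interaction])
    beta, *_ = np.linalg.lstsq(X, rv[t], rcond=None)
    return beta
```

    A negative estimate of the last coefficient means the weight on the daily lag is reduced on days when the lagged quarticity, and hence the measurement error, is large.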

    Although there is no reason to expect that autoregressive models of order one will be able to accurately capture long memory in realized volatility, we estimate AR(1) models as a point of reference. The AR and ARQ models are defined as

    $$RV_t = \beta_0 + \beta_1 RV_{t-1} + u_t \qquad (6)$$

    and

    $$RV_t = \beta_0 + \underbrace{(\beta_1 + \beta_{1Q} RQ_{t-1}^{1/2})}_{\beta_{1,t}} RV_{t-1} + u_t, \qquad (7)$$

    respectively.

    Due to noisy data and related estimation errors, forecasts from realized volatility models might occasionally appear as unreasonably high or low. Thus, in line with Swanson et al. (1997) and Bollerslev et al. (2016), we filter forecasts from all models so that any forecast outside the empirical distribution of the estimation sample is replaced by the sample mean.
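    The forecast filter described above can be sketched in a few lines. The sketch assumes that "outside the empirical distribution" means below the minimum or above the maximum of the estimation sample; that reading, and the function name, are assumptions of this example.

```python
import numpy as np

def insanity_filter(forecasts, estimation_sample):
    """Replace forecasts outside the estimation-sample range by the sample mean."""
    f = np.asarray(forecasts, dtype=float).copy()
    s = np.asarray(estimation_sample, dtype=float)
    outside = (f < s.min()) | (f > s.max())
    f[outside] = s.mean()
    return f
```

    For an estimation sample of 1, 2 and 3, a forecast of 2.5 would be kept, whereas forecasts of 10 or -1 would both be replaced by the sample mean of 2.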

    Table 2 reports in-sample parameter estimates for the ARQ, HARQ, and HARQ-F models, along with the benchmark AR and HAR models, for one-day-ahead EURUSD (upper panel) and USDJPY (lower panel) volatility forecasts. Robust standard errors (s.e.) are computed as proposed by White (1980). R², MSE, and QLIKE are displayed at the bottom of each panel.

    Table 2.  In-sample estimation results, one-day-ahead volatility forecasts.

    | EURUSD | AR     | HAR    | ARQ    | HARQ   | HARQ-F |
    |--------|--------|--------|--------|--------|--------|
    | β0     | 1.3663 | 0.3961 | 0.7428 | 0.2785 | 0.0651 |
    | s.e.   | 0.1843 | 0.0598 | 0.0969 | 0.0586 | 0.0685 |
    | β1     | 0.5530 | 0.2364 | 0.7903 | 0.4349 | 0.3740 |
    | s.e.   | 0.0653 | 0.0730 | 0.0388 | 0.0754 | 0.0792 |
    | β2     |        | 0.3767 |        | 0.3072 | 0.4613 |
    | s.e.   |        | 0.0717 |        | 0.0697 | 0.1031 |
    | β3     |        | 0.2572 |        | 0.1850 | 0.2398 |
    | s.e.   |        | 0.0532 |        | 0.0515 | 0.0822 |
    | β1Q    |        |        | 2.4914 | 1.3708 | 0.9710 |
    | s.e.   |        |        | 0.3377 | 0.1939 | 0.2266 |
    | β2Q    |        |        |        |        | 1.7578 |
    | s.e.   |        |        |        |        | 0.8706 |
    | β3Q    |        |        |        |        | 3.9819 |
    | s.e.   |        |        |        |        | 1.1618 |
    | R²     | 0.3058 | 0.3956 | 0.3685 | 0.4101 | 0.4166 |
    | MSE    | 6.3005 | 5.4852 | 5.7315 | 5.3538 | 5.2950 |
    | QLIKE  | 0.1647 | 0.1230 | 0.1540 | 0.1217 | 0.1199 |

    | USDJPY | AR      | HAR     | ARQ     | HARQ    | HARQ-F  |
    |--------|---------|---------|---------|---------|---------|
    | β0     | 2.3073  | 1.0682  | 1.3537  | 0.7811  | 0.5218  |
    | s.e.   | 0.2362  | 0.1381  | 0.2207  | 0.1429  | 0.1328  |
    | β1     | 0.2854  | 0.1819  | 0.6180  | 0.5177  | 0.4416  |
    | s.e.   | 0.0804  | 0.0806  | 0.0853  | 0.1106  | 0.1260  |
    | β2     |         | 0.1441  |         | 0.0542  | 0.2345  |
    | s.e.   |         | 0.0585  |         | 0.0543  | 0.1072  |
    | β3     |         | 0.3443  |         | 0.2188  | 0.2228  |
    | s.e.   |         | 0.0499  |         | 0.0493  | 0.0658  |
    | β1Q    |         |         | 0.2295  | 0.1967  | 0.1526  |
    | s.e.   |         |         | 0.0318  | 0.0386  | 0.0476  |
    | β2Q    |         |         |         |         | 0.2296  |
    | s.e.   |         |         |         |         | 0.0849  |
    | β3Q    |         |         |         |         | 0.3573  |
    | s.e.   |         |         |         |         | 0.1142  |
    | R²     | 0.0814  | 0.1154  | 0.1489  | 0.1581  | 0.1642  |
    | MSE    | 33.6096 | 32.3668 | 31.1409 | 30.8063 | 30.5818 |
    | QLIKE  | 0.3214  | 0.2561  | 0.2663  | 0.2377  | 0.2242  |

    Note: The table contains in-sample parameter estimates and corresponding standard errors (White, 1980), together with R², MSE, and QLIKE. MSE and QLIKE are computed from (12) and (13). Superscripts *, **, and *** represent statistical significance in a two-sided t-test at 1%, 5% and 10% levels, respectively.


    The coefficients β1Q are negative and exhibit strong statistical significance, aligning with the hypothesis that RQ represents time-varying measurement error. When comparing the autoregressive (AR) coefficient of the AR model to the autoregressive parameter in the ARQ model, the AR coefficient is markedly lower, reflecting the difference in persistence between the models.

    In the comparative analysis of the HAR and HARQ models applied to both currency pairs, the HAR model assigns more emphasis to the weekly and monthly lags, which are generally less sensitive to measurement errors. In contrast, the HARQ model typically assigns a higher weight to the daily lag. However, when measurement errors are substantial, the HARQ model reduces the weight on the daily lag to accommodate the time-varying nature of the measurement errors in the daily realized volatility (RV). The flexible version of this model, the HARQ-F, allows for variability in the weekly and monthly lags, resulting in slightly altered parameters compared to the standard HARQ model. Notably, the coefficients β2Q and β3Q in the HARQ-F model are statistically significant, and this model demonstrates a modest enhancement in in-sample fit relative to the HARQ model.

    To further assess the out-of-sample performance of the HARQ model, we consider three alternative HAR type specifications. More specifically, we include both the HAR-with-Jumps (HAR-J) and the Continuous-HAR (CHAR) proposed by Andersen et al. (2007), as well as the SHAR model proposed by Patton and Sheppard (2015), in the forecasting comparisons. Based on the Bi-Power Variation (BPV) measure of Barndorff-Nielsen and Shephard (2004), HAR-J and CHAR decompose the total variation into a continuous and a discontinuous (jump) part.

    The HAR-J model augments the standard HAR model with a measure of the jump variation;

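    $$RV_t = \beta_0 + \beta_1 RV_{t-1} + \beta_2 RV_{t-1|t-5} + \beta_3 RV_{t-1|t-22} + \beta_J J_{t-1} + u_t, \qquad (8)$$

    where $J_t \equiv \max[RV_t - BPV_t, 0]$, and the BPV measure is defined as

    $$BPV_t \equiv \mu_1^{-2} \sum_{i=1}^{M-1} |r_{t,i}||r_{t,i+1}|, \qquad (9)$$

    with $\mu_1 = \sqrt{2/\pi} = E(|Z|)$, where $Z$ is a standard normal random variable.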
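    The jump measure entering (8) can be illustrated with a short sketch of (9), assuming one day of intraday returns in a NumPy array; the function names are illustrative.

```python
import numpy as np

MU1 = np.sqrt(2.0 / np.pi)  # E|Z| for a standard normal Z

def bipower_variation(r: np.ndarray) -> float:
    """Bi-power variation as in (9): scaled sum of adjacent absolute returns."""
    a = np.abs(np.asarray(r, dtype=float))
    return float(np.sum(a[:-1] * a[1:]) / MU1 ** 2)

def jump_variation(r: np.ndarray) -> float:
    """Jump component J_t = max(RV_t - BPV_t, 0)."""
    r = np.asarray(r, dtype=float)
    return max(float(np.sum(r ** 2)) - bipower_variation(r), 0.0)
```

    A single large return surrounded by small ones raises the sum of squared returns far more than the product of adjacent absolute returns, so most of that day's variation is attributed to the jump component.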

    The CHAR model includes measures of the continuous component of the total variation as covariates;

    $$RV_t = \beta_0 + \beta_1 BPV_{t-1} + \beta_2 BPV_{t-1|t-5} + \beta_3 BPV_{t-1|t-22} + u_t. \qquad (10)$$

    Inspired by the semivariation measures of Barndorff-Nielsen et al. (2008), Patton and Sheppard (2015) propose the SHAR model, which, in contrast to the HAR model, effectively allows for asymmetric responses in volatility forecasts from negative and positive intraday returns. More specifically, when $RV_t^- \equiv \sum_{i=1}^{M} r_{t,i}^2 I\{r_{t,i} < 0\}$ and $RV_t^+ \equiv \sum_{i=1}^{M} r_{t,i}^2 I\{r_{t,i} > 0\}$, the SHAR model is defined as:

    $$RV_t = \beta_0 + \beta_1^+ RV_{t-1}^+ + \beta_1^- RV_{t-1}^- + \beta_2 RV_{t-1|t-5} + \beta_3 RV_{t-1|t-22} + u_t. \qquad (11)$$
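    A minimal sketch of the two semivariances used in (11), again assuming one day of intraday returns in a NumPy array (the function name is illustrative):

```python
import numpy as np

def realized_semivariances(r: np.ndarray) -> tuple[float, float]:
    """Return (RV_plus, RV_minus): squared returns split by the sign of the return."""
    r = np.asarray(r, dtype=float)
    rv_plus = float(np.sum(r[r > 0] ** 2))
    rv_minus = float(np.sum(r[r < 0] ** 2))
    return rv_plus, rv_minus
```

    By construction the two components add up to the sum of all squared returns, so the SHAR model nests the HAR model when the two daily coefficients are equal.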

    To evaluate model performance, we consider the mean squared error (MSE) and the QLIKE loss, both of which are robust to noise according to Patton (2011). MSE is defined as

    $$MSE(RV_t, F_t) \equiv (RV_t - F_t)^2, \qquad (12)$$

    where $F_t$ refers to the one-period direct forecast. QLIKE is defined as

    $$QLIKE(RV_t, F_t) \equiv \frac{RV_t}{F_t} - \ln\left(\frac{RV_t}{F_t}\right) - 1. \qquad (13)$$
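    The two loss functions in (12) and (13) translate directly into code. The sketch below works elementwise on scalars or NumPy arrays and assumes strictly positive realized values and forecasts for QLIKE.

```python
import numpy as np

def mse_loss(rv, forecast):
    """Squared forecast error, as in (12)."""
    return (np.asarray(rv, dtype=float) - np.asarray(forecast, dtype=float)) ** 2

def qlike_loss(rv, forecast):
    """QLIKE loss, as in (13); zero when the forecast equals the realized value."""
    ratio = np.asarray(rv, dtype=float) / np.asarray(forecast, dtype=float)
    return ratio - np.log(ratio) - 1.0
```

    Unlike MSE, QLIKE is asymmetric: with a realized value of 1, a forecast of 0.5 is penalized more heavily than a forecast of 2.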

    Table 3 contains one-day-ahead forecasts for EURUSD and USDJPY. The table reports model performance, expressed as model loss normalized by the loss of the HAR model. Each row reflects a combination of estimation window and loss function. The lowest ratio in each row, which identifies the best-performing model, is in bold. We evaluate the models using both a rolling window (RW) and an expanding window (EW). In both cases, model parameters are re-estimated each day: the RW uses a fixed length of the previous 1000 days, whereas the EW uses all observations available up to that day. The sample sizes for the EW thus range from 1000 to 3201 days. The results are consistent in that the HARQ-F model is the best performer for both currency pairs and across loss functions and estimation windows. The HARQ model is closest to HARQ-F. Neither HAR-J, CHAR, nor SHAR appears to consistently improve upon the standard HAR model.
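    The difference between the two estimation schemes can be sketched with the plain HAR model as an example. This is an illustration under simplifying assumptions: the window is counted in usable regression rows, estimation is by `numpy.linalg.lstsq`, and the function names are hypothetical.

```python
import numpy as np

def har_design(rv: np.ndarray):
    """Regressors (constant, daily, weekly, monthly lags) and targets for HAR."""
    t = np.arange(22, len(rv))
    weekly = np.array([rv[s - 5:s].mean() for s in t])
    monthly = np.array([rv[s - 22:s].mean() for s in t])
    X = np.column_stack([np.ones(len(t)), rv[t - 1], weekly, monthly])
    return X, rv[t]

def one_step_forecasts(rv, window: int = 1000, rolling: bool = True):
    """Re-estimate HAR every day and forecast the next observation.

    rolling=True uses the previous `window` rows (RW); rolling=False uses
    every row available so far (EW). Returns (forecasts, realized values).
    """
    X, y = har_design(np.asarray(rv, dtype=float))
    preds = np.empty(len(y) - window)
    for k in range(window, len(y)):
        start = k - window if rolling else 0
        beta, *_ = np.linalg.lstsq(X[start:k], y[start:k], rcond=None)
        preds[k - window] = X[k] @ beta
    return preds, y[window:]
```

    The first forecast is identical under both schemes, since the two estimation samples coincide; they diverge as the expanding window accumulates older observations.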

    Table 3.  Out-of-sample forecast losses, one-day-ahead volatility forecasts.

    | EURUSD   | AR     | HAR    | HAR-J  | CHAR   | SHAR   | ARQ     | HARQ   | HARQ-F   |
    |----------|--------|--------|--------|--------|--------|---------|--------|----------|
    | MSE-RW   | 1.1483 | 1.0000 | 1.0088 | 0.9945 | 1.0080 | 1.0311  | 0.9759 | 0.9655*  |
    | MSE-EW   | 1.1619 | 1.0000 | 0.9984 | 0.9908 | 1.0050 | 1.02660 | 0.9742 | 0.9720*  |
    | QLIKE-RW | 1.3153 | 1.0000 | 0.9907 | 0.9813 | 1.0078 | 1.1575  | 0.9767 | 0.9582*  |
    | QLIKE-EW | 1.3915 | 1.0000 | 0.9907 | 0.9944 | 1.0052 | 1.1927  | 0.9952 | 0.9721** |

    | USDJPY   | AR     | HAR    | HAR-J  | CHAR   | SHAR   | ARQ    | HARQ   | HARQ-F  |
    |----------|--------|--------|--------|--------|--------|--------|--------|---------|
    | MSE-RW   | 1.0502 | 1.0000 | 1.0053 | 0.9979 | 1.0238 | 0.8907 | 0.8885 | 0.8832* |
    | MSE-EW   | 1.0475 | 1.0000 | 1.0243 | 1.0133 | 1.0515 | 0.9558 | 0.9446 | 0.9376* |
    | QLIKE-RW | 1.2320 | 1.0000 | 1.0748 | 0.9944 | 0.9811 | 0.9482 | 0.8824 | 0.8667* |
    | QLIKE-EW | 1.3066 | 1.0000 | 1.0023 | 0.9800 | 0.9941 | 1.0039 | 0.8949 | 0.8519* |

    Note: Model performance, expressed as model loss normalized by the loss of the HAR model. Each row reflects a combination of estimation window and loss function. Ratio for the best-performing model on each row in bold. Asterisks * and ** denote significance at the 1% and 5% levels in one-sided Diebold-Mariano tests of superior performance of the best-performing model compared to the HAR model.


    Judging from Table 3, it is beneficial to include RQ as an explanatory variable when RV is measured inaccurately. However, precise measurement of RV becomes more difficult when RV is high, inducing a positive correlation between RV and RQ. At the same time, high RV often coincides with jumps. To clarify whether the performance of RQ-based models is due to jump dynamics, Table 4 further segments the results in Table 3 into forecasts for days when the previous day's RQ was very high (Top 5% RQ, Table 4b) and the remaining sample (Bottom 95% RQ, Table 4a). As this breakdown shows, the RQ-based models perform relatively well also during periods of non-extreme heteroscedasticity of RQ.
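    The stratification by lagged RQ can be sketched as follows. The sketch assumes that the reported ratios are ratios of average losses within each stratum and that the cut-off is the empirical 95% quantile of the previous day's RQ over the evaluation sample; both points are assumptions of this example, as are the names used.

```python
import numpy as np

def stratified_loss_ratio(loss_model, loss_har, rq_lagged, top: float = 0.05):
    """Loss ratios relative to HAR for the bottom and top RQ strata.

    Returns (ratio for the remaining days, ratio for the top `top` share of days).
    """
    loss_model = np.asarray(loss_model, dtype=float)
    loss_har = np.asarray(loss_har, dtype=float)
    rq_lagged = np.asarray(rq_lagged, dtype=float)
    cutoff = np.quantile(rq_lagged, 1.0 - top)
    high = rq_lagged > cutoff
    bottom_ratio = loss_model[~high].mean() / loss_har[~high].mean()
    top_ratio = loss_model[high].mean() / loss_har[high].mean()
    return float(bottom_ratio), float(top_ratio)
```

    A ratio below one in the bottom stratum indicates that a model's gains over HAR are not confined to the few days that follow extreme RQ values.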

    Table 4.  Stratified one-day-ahead out-of-sample forecast losses.

    (a) Bottom 95% RQ

    | EURUSD   | AR     | HAR    | HAR-J  | CHAR   | SHAR   | ARQ    | HARQ   | HARQ-F |
    |----------|--------|--------|--------|--------|--------|--------|--------|--------|
    | MSE-RW   | 1.1156 | 1.0000 | 0.9937 | 0.9907 | 1.0021 | 1.0636 | 0.9925 | 0.9794 |
    | MSE-EW   | 1.1175 | 1.0000 | 0.9887 | 0.9885 | 1.0020 | 1.0711 | 0.9967 | 0.9866 |
    | QLIKE-RW | 1.3299 | 1.0000 | 0.9975 | 0.9855 | 1.0071 | 1.1598 | 0.9745 | 0.9555 |
    | QLIKE-EW | 1.4108 | 1.0000 | 0.9956 | 0.9980 | 1.0055 | 1.1995 | 0.9944 | 0.9720 |

    | USDJPY   | AR     | HAR    | HAR-J  | CHAR   | SHAR   | ARQ    | HARQ   | HARQ-F |
    |----------|--------|--------|--------|--------|--------|--------|--------|--------|
    | MSE-RW   | 1.0330 | 1.0000 | 1.0146 | 0.9984 | 0.9940 | 0.9592 | 0.9526 | 0.9495 |
    | MSE-EW   | 1.0590 | 1.0000 | 0.9962 | 0.9925 | 1.0001 | 0.9849 | 0.9681 | 0.9601 |
    | QLIKE-RW | 1.2507 | 1.0000 | 1.1353 | 0.9877 | 0.9829 | 0.9542 | 0.8797 | 0.8450 |
    | QLIKE-EW | 1.3266 | 1.0000 | 0.9883 | 0.9734 | 0.9993 | 1.0100 | 0.8887 | 0.8434 |

    (b) Top 5% RQ

    | EURUSD   | AR     | HAR    | HAR-J  | CHAR   | SHAR   | ARQ    | HARQ   | HARQ-F |
    |----------|--------|--------|--------|--------|--------|--------|--------|--------|
    | MSE-RW   | 1.2276 | 1.0000 | 1.0453 | 1.0036 | 1.0225 | 0.9523 | 0.9355 | 0.9316 |
    | MSE-EW   | 1.2642 | 1.0000 | 1.0206 | 0.9960 | 1.0121 | 0.9218 | 0.9224 | 0.9382 |
    | QLIKE-RW | 1.0876 | 1.0000 | 0.8851 | 0.9152 | 1.0186 | 1.1223 | 1.0116 | 0.9996 |
    | QLIKE-EW | 1.0902 | 1.0000 | 0.9141 | 0.9389 | 1.0006 | 1.0856 | 1.0081 | 0.9745 |

    | USDJPY   | AR     | HAR    | HAR-J  | CHAR   | SHAR   | ARQ    | HARQ   | HARQ-F |
    |----------|--------|--------|--------|--------|--------|--------|--------|--------|
    | MSE-RW   | 1.0674 | 1.0000 | 1.0025 | 0.9974 | 1.0535 | 0.9425 | 0.8700 | 0.8518 |
    | MSE-EW   | 1.0347 | 1.0000 | 1.0566 | 1.0365 | 1.1090 | 0.9246 | 0.9183 | 0.9126 |
    | QLIKE-RW | 1.0202 | 1.0000 | 1.5755 | 1.0697 | 0.9601 | 0.8803 | 0.9135 | 0.9999 |
    | QLIKE-EW | 1.0544 | 1.0000 | 1.1789 | 1.0628 | 0.9279 | 0.9278 | 0.9730 | 0.9588 |

    Note: The table segments the results in Table 3 according to RQ. The bottom panel shows the ratios for days following a value of RQ in the top 5%. The top panel shows the results for the remaining 95% of the sample. Ratio for the best-performing model on each row in bold.


    In practitioner applications, longer forecasts than one day are often of interest. We now extend our analysis to weekly and monthly horizons, using direct forecasts. The daily forecast analysis in subsubsection 3.2.1 indicates the lag order of RQ plays an important role in forecast accuracy. Hence, following Bollerslev et al. (2016), we consider the HARQ-h model, and adjust the lag corresponding to the specific forecast horizon only. Specifically, for the weekly and monthly forecasts analysed here, the relevant HARQ-h specifications become

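    $$RV_{t+4|t} = \beta_0 + \beta_1 RV_{t-1} + \underbrace{(\beta_2 + \beta_{2Q} RQ_{t-1|t-5}^{1/2})}_{\beta_{2,t}} RV_{t-1|t-5} + \beta_3 RV_{t-1|t-22} + u_t \qquad (14)$$

    and

    $$RV_{t+21|t} = \beta_0 + \beta_1 RV_{t-1} + \beta_2 RV_{t-1|t-5} + \underbrace{(\beta_3 + \beta_{3Q} RQ_{t-1|t-22}^{1/2})}_{\beta_{3,t}} RV_{t-1|t-22} + u_t, \qquad (15)$$

    respectively.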
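    For direct multi-period forecasts, the left-hand side of (14) and (15) is an average over future days. A minimal sketch of how such a target could be built from a daily series is given below; the function name and the NaN padding are choices of this example.

```python
import numpy as np

def forward_mean(x, h: int) -> np.ndarray:
    """Entry t holds mean(x[t], ..., x[t+h-1]); the last h-1 entries are NaN.

    With h=5 this gives the weekly target RV_{t+4|t}, with h=22 the monthly
    target RV_{t+21|t}.
    """
    x = np.asarray(x, dtype=float)
    csum = np.concatenate(([0.0], np.cumsum(x)))
    out = np.full(len(x), np.nan)
    out[:len(x) - h + 1] = (csum[h:] - csum[:-h]) / h
    return out
```

    Regressing this target on regressors dated t-1 and earlier yields the h-day forecast in one step, without iterating one-day forecasts forward.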

    Table 5 presents in-sample parameter estimates across model specifications. The patterns observed here closely resemble those of the daily estimates detailed in Table 2. All coefficients on RQ (β1Q, β2Q, β3Q) are negative and, except for the (h=22) lag, statistically significant. This indicates that capturing measurement errors is relevant also for forecast horizons beyond one day. The HARQ model consistently allocates greater weight to the daily lag compared to the standard HAR model. Similarly, the HARQ-h model predominantly allocates its weight towards the time-varying lag. The weights of the HARQ-F model on the different lags are relatively more stable when compared to the HARQ-h model.

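    Table 5.  In-sample weekly and monthly model estimates.

    (a) EURUSD

    | Weekly | AR     | ARQ    | HAR    | HARQ   | HARQ-F  | HARQ-h |
    |--------|--------|--------|--------|--------|---------|--------|
    | β0     | 0.8646 | 0.2634 | 0.5680 | 0.4758 | -0.0250 | 0.2275 |
    | s.e.   | 0.1345 | 0.0927 | 0.0997 | 0.0882 | 0.0895  | 0.0861 |
    | β1     | 0.7168 | 0.9620 | 0.1194 | 0.2752 | 0.1836  | 0.1181 |
    | s.e.   | 0.0480 | 0.0400 | 0.0264 | 0.0395 | 0.0269  | 0.0214 |
    | β2     |        |        | 0.3938 | 0.3395 | 0.5777  | 0.7635 |
    | s.e.   |        |        | 0.0887 | 0.0881 | 0.1282  | 0.1139 |
    | β3     |        |        | 0.3008 | 0.2440 | 0.3131  | 0.0876 |
    | s.e.   |        |        | 0.0880 | 0.0817 | 0.1275  | 0.0940 |
    | β1Q    |        | 5.4876 |        | 1.0749 | 0.4728  |        |
    | s.e.   |        | 0.4817 |        | 0.1377 | 0.1005  |        |
    | β2Q    |        |        |        |        | 2.7357  | 4.9739 |
    | s.e.   |        |        |        |        | 0.9302  | 0.7181 |
    | β3Q    |        |        |        |        | 5.6441  |        |
    | s.e.   |        |        |        |        | 1.4540  |        |
    | R²     | 0.5138 | 0.5642 | 0.5453 | 0.5604 | 0.5843  | 0.5756 |
    | MSE    | 2.6073 | 2.3370 | 2.4385 | 2.3576 | 2.2292  | 2.2759 |
    | QLIKE  | 0.0862 | 0.0731 | 0.0752 | 0.0735 | 0.0679  | 0.0704 |

    | Monthly | AR     | ARQ    | HAR    | HARQ   | HARQ-F | HARQ-h  |
    |---------|--------|--------|--------|--------|--------|---------|
    | β0      | 1.6388 | 0.9642 | 0.9269 | 0.8452 | 0.2328 | 0.2153  |
    | s.e.    | 0.1806 | 0.1840 | 0.2246 | 0.2099 | 0.2080 | 0.2093  |
    | β1      | 0.4616 | 0.7373 | 0.0717 | 0.2097 | 0.1131 | 0.0646  |
    | s.e.    | 0.0564 | 0.0616 | 0.0205 | 0.0401 | 0.0248 | 0.0185  |
    | β2      |        |        | 0.2091 | 0.1606 | 0.3706 | 0.2176  |
    | s.e.    |        |        | 0.0587 | 0.0554 | 0.0962 | 0.0563  |
    | β3      |        |        | 0.4163 | 0.3661 | 0.5153 | 0.7179  |
    | s.e.    |        |        | 0.1186 | 0.1174 | 0.1498 | 0.1106  |
    | β1Q     |        | 6.1534 |        | 0.9499 | 0.3246 |         |
    | s.e.    |        | 0.9900 |        | 0.1815 | 0.0846 |         |
    | β2Q     |        |        |        |        | 2.3111 |         |
    | s.e.    |        |        |        |        | 0.8020 |         |
    | β3Q     |        |        |        |        | 7.8467 | 10.9979 |
    | s.e.    |        |        |        |        | 2.1071 | 1.9082  |
    | R²      | 0.4297 | 0.5191 | 0.5072 | 0.5237 | 0.5678 | 0.5568  |
    | MSE     | 2.1913 | 1.8477 | 1.8932 | 1.8299 | 1.6606 | 1.7027  |
    | QLIKE   | 0.1073 | 0.0804 | 0.0839 | 0.0012 | 0.0760 | 0.0788  |

    (b) USDJPY

    | Weekly | AR      | ARQ     | HAR     | HARQ    | HARQ-F  | HARQ-h  |
    |--------|---------|---------|---------|---------|---------|---------|
    | β0     | 2.0305  | 1.1976* | 1.3310  | 1.1591  | 0.8708  | 0.9646  |
    | s.e.   | 0.2484  | 0.1550  | 0.1967  | 0.1646  | 0.1701  | 0.1564  |
    | β1     | 0.3709  | 0.6801  | 0.0687  | 0.2722  | 0.1650  | 0.0668  |
    | s.e.   | 0.0717  | 0.0512  | 0.0266  | 0.0500  | 0.0373  | 0.0207  |
    | β2     |         |         | 0.1294  | 0.0742  | 0.3558  | 0.4971  |
    | s.e.   |         |         | 0.0700  | 0.0609  | 0.0787  | 0.0790  |
    | β3     |         |         | 0.3910  | 0.3147  | 0.2622  | 0.1829* |
    | s.e.   |         |         | 0.0703  | 0.0621  | 0.0959  | 0.0693  |
    | β1Q    |         | 0.6085  |         | 0.1190  | 0.0571  |         |
    | s.e.   |         | 0.0534  |         | 0.0173  | 0.0141  |         |
    | β2Q    |         |         |         |         | 0.3653  | 0.5357  |
    | s.e.   |         |         |         |         | 0.0704  | 0.0648  |
    | β3Q    |         |         |         |         | -0.2750 |         |
    | s.e.   |         |         |         |         | 0.2010  |         |
    | R²     | 0.1367  | 0.2323  | 0.1848  | 0.2270  | 0.2557  | 0.2475  |
    | MSE    | 11.6923 | 10.3980 | 11.0412 | 10.4701 | 10.0811 | 10.1919 |
    | QLIKE  | 0.2361  | 0.4197  | 0.2057  | 0.1937  | 0.4076  | 0.1405  |

    | Monthly | AR      | ARQ    | HAR    | HARQ   | HARQ-F  | HARQ-h |
    |---------|---------|--------|--------|--------|---------|--------|
    | β0      | 2.5786  | 2.2815 | 1.7358 | 1.6356 | 1.3894  | 1.3900 |
    | s.e.    | 0.1928  | 0.2245 | 0.2792 | 0.2678 | 0.3106  | 0.3151 |
    | β1      | 0.2011* | 0.3121 | 0.0286 | 0.1460 | 0.0829  | 0.0283 |
    | s.e.    | 0.0363  | 0.0566 | 0.0119 | 0.0258 | 0.0166  | 0.0113 |
    | β2      |         |        | 0.0865 | 0.0541 | 0.1886* | 0.0923 |
    | s.e.    |         |        | 0.0389 | 0.0333 | 0.0487  | 0.0376 |
    | β3      |         |        | 0.3460 | 0.3030 | 0.3340  | 0.4811 |
    | s.e.    |         |        | 0.0916 | 0.0883 | 0.1346  | 0.1220 |
    | β1Q     |         | 0.2167 |        | 0.0678 | 0.0318  |        |
    | s.e.    |         | 0.0832 |        | 0.0093 | 0.0068  |        |
    | β2Q     |         |        |        |        | 0.1659  |        |
    | s.e.    |         |        |        |        | 0.0465  |        |
    | β3Q     |         |        |        |        | -0.3946 | 0.7392 |
    | s.e.    |         |        |        |        | 0.2942  | 0.2900 |
    | R²      | 0.1414  | 0.2106 | 0.2205 | 0.2496 | 0.2761  | 0.2542 |
    | MSE     | 5.4365  | 4.9983 | 4.9351 | 4.7513 | 4.58326 | 4.7220 |
    | QLIKE   | 0.2143  | 0.1973 | 0.1801 | 0.1734 | 0.1634  | 0.1680 |

    Note: In-sample parameter estimates for weekly (h=5) and monthly (h=22) forecasting models. EURUSD in upper panel (Table 5a) and USDJPY in lower panel (Table 5b). Robust standard errors (s.e.) using Newey and West (1987) accommodate autocorrelation up to order 10 (h=5) and 44 (h=22), respectively. Superscripts *, ** and *** represent statistical significance in a two-sided t-test at 1%, 5%, and 10% levels.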


    Table 6 and Table 7 detail the out-of-sample performance for weekly and monthly forecasts, respectively. Notably, the HAR-J, CHAR, and SHAR models generally fail to demonstrate consistent improvements over the basic HAR model. This is a sharp contrast to the RQ-augmented models. The HARQ-F model outperforms the HAR model both for EURUSD and USDJPY for nearly all instances. Also, HARQ-h delivers forecasts that are relatively consistent with the HAR model. Judging from both weekly and monthly results, the inherent flexibility of the HARQ-F is beneficial also for longer-term forecasts. We note that, at the monthly forecasting horizon for USDJPY, there is some variability as to preferred Q-specifications. Also, in some monthly instances, the Diebold-Mariano null hypothesis of equal predictability cannot be rejected. This is not unreasonable, since the number of independent monthly observations naturally becomes lower than for corresponding shorter forecasting horizons, leading to higher parameter uncertainty and related noise in volatility estimates.

    Table 6.  Weekly out-of-sample forecast losses.

    | EURUSD   | AR     | HAR    | HAR-J  | CHAR   | SHAR   | ARQ    | HARQ   | HARQ-F  | HARQ-h |
    |----------|--------|--------|--------|--------|--------|--------|--------|---------|--------|
    | MSE-RW   | 1.3063 | 1.0000 | 0.9636 | 0.9884 | 1.0017 | 1.1459 | 0.9677 | 0.9024* | 0.9205 |
    | MSE-EW   | 1.2702 | 1.0000 | 0.9433 | 0.9559 | 0.9997 | 1.1288 | 0.9501 | 0.8996* | 0.9117 |
    | QLIKE-RW | 1.5923 | 1.0000 | 0.9819 | 0.9840 | 0.9995 | 1.3558 | 0.9932 | 0.8701  | 0.9283 |
    | QLIKE-EW | 1.7682 | 1.0000 | 0.9874 | 1.0031 | 1.0033 | 1.4134 | 0.9648 | 0.8832* | 0.9297 |

    | USDJPY   | AR     | HAR    | HAR-J  | CHAR   | SHAR   | ARQ    | HARQ   | HARQ-F  | HARQ-h  |
    |----------|--------|--------|--------|--------|--------|--------|--------|---------|---------|
    | MSE-RW   | 1.0618 | 1.0000 | 0.9464 | 0.9509 | 0.9965 | 0.9064 | 0.8971 | 0.8393* | 0.8443  |
    | MSE-EW   | 1.1707 | 1.0000 | 1.0148 | 1.0021 | 1.0336 | 1.0194 | 0.9388 | 0.8993  | 0.8976* |
    | QLIKE-RW | 1.3119 | 1.0000 | 1.0057 | 0.9910 | 0.9740 | 1.0493 | 0.9099 | 0.8246* | 0.8359  |
    | QLIKE-EW | 1.3847 | 1.0000 | 0.9918 | 0.9768 | 1.0002 | 1.1391 | 0.9179 | 0.8350* | 0.8463  |

    Note: Model performance, expressed as model loss normalized by the loss of the HAR model. Each row reflects a combination of estimation window and loss function. Ratio for the best-performing model on each row in bold. Asterisks * and ** denote significance at the 1% and 5% levels in one-sided Diebold-Mariano tests of superior performance of the best-performing model compared to the HAR model.

    Table 7.  Monthly out-of-sample forecast losses.

    | EURUSD   | AR     | HAR    | HAR-J  | CHAR   | SHAR   | ARQ    | HARQ   | HARQ-F  | HARQ-h |
    |----------|--------|--------|--------|--------|--------|--------|--------|---------|--------|
    | MSE-RW   | 1.3289 | 1.0000 | 0.9876 | 0.9952 | 1.0003 | 1.1876 | 0.9625 | 0.8803* | 0.9004 |
    | MSE-EW   | 1.3265 | 1.0000 | 0.9759 | 1.0010 | 1.0044 | 1.1707 | 0.9537 | 0.8723  | 0.9070 |
    | QLIKE-RW | 1.4301 | 1.0000 | 0.9945 | 0.9950 | 0.9982 | 1.2380 | 0.9622 | 0.9215* | 0.9279 |
    | QLIKE-EW | 1.5155 | 1.0000 | 0.9951 | 1.0051 | 1.0011 | 1.2596 | 0.9599 | 0.9333  | 0.9784 |

    | USDJPY   | AR     | HAR    | HAR-J  | CHAR   | SHAR   | ARQ    | HARQ   | HARQ-F  | HARQ-h  |
    |----------|--------|--------|--------|--------|--------|--------|--------|---------|---------|
    | MSE-RW   | 1.2529 | 1.0000 | 1.0215 | 1.0086 | 0.9893 | 1.5820 | 1.0500 | 1.0070  | 0.9621* |
    | MSE-EW   | 1.2547 | 1.0000 | 1.0073 | 1.0029 | 1.0119 | 1.1181 | 0.9620 | 0.9495* | 0.9780  |
    | QLIKE-RW | 1.1937 | 1.0000 | 1.0023 | 0.9963 | 0.9893 | 1.0313 | 0.9307 | 0.9454  | 1.0318  |
    | QLIKE-EW | 1.2894 | 1.0000 | 0.9959 | 0.9909 | 1.0000 | 1.1453 | 0.9452 | 0.8932* | 1.0143  |

    Note: Model performance, expressed as model loss normalized by the loss of the HAR model. Each row reflects a combination of estimation window and loss function. Ratio for the best-performing model on each row in bold. Asterisks * and ** denote significance at the 1% and 5% levels in one-sided Diebold-Mariano tests of superior performance of the best-performing model compared to the HAR model.


    The intention of the HARQ model is to capture the heteroskedastic measurement error of realized variance. The HARQ model in (5) approximates this through the square root of RQ. Bollerslev et al. (2016) argue that this choice helps avoid possible issues with numerical stability. Still, this specification is somewhat ad hoc, and a number of reasonable alternatives exist. To clarify whether the performance of the HARQ model is sensitive to the definition of RQ, we follow Bollerslev et al. (2016) and substitute $RQ$, $RQ^{-1/2}$, $RQ^{-1}$, and $\log(RQ)$ in place of $RQ^{1/2}$. Furthermore, we augment the standard HAR and HARQ models with $RQ^{1/2}$ as an additional explanatory variable, which allows the HAR(Q) model intercept to be time-varying.

    Table 8 reports the out-of-sample forecast results from the alternative HARQ specifications. We normalize all losses by those of the HARQ model based on $RQ^{1/2}$.

    Table 8.  Alternative HARQ Specifications.
    Alternative RQ transformations Adding RQ1/2
    EURUSD RQ RQ1/2 RQ1/2 RQ1 log(RQ) HAR HARQ
    MSE-RW 1.0023 1.0000 1.0263 1.0246 1.0092 1.0309 1.0052
    MSE-IW 1.0016 1.0000 1.0274 1.0265 1.0069 1.0292 1.0086
    QLIKE-RW 1.0042 1.0000 1.0326 1.0304 1.0007 1.0250 1.0067
    QLIKE-IW 1.0014 1.0000 1.0064 1.0254 0.9937 1.0044 1.0164
    USDJPY RQ RQ1/2 RQ1/2 RQ1 log(RQ) HAR HARQ
    MSE-RW 1.0001 1.0000 1.1345 1.1225 1.0516 1.1202 1.0118
    MSE-IW 1.0049 1.0000 1.0606 1.0543 0.9931 1.0512 1.0186
    QLIKE-RW 1.0097 1.0000 1.1439 1.1067 0.9794 1.0731 1.0455
    QLIKE-IW 1.0188 1.0000 1.1105 1.0841 0.9322 1.0358 0.9989
    Note: Model performance, expressed as model loss normalized by the loss of the HARQ model that relies on $RQ^{1/2}$. Each row reflects a combination of estimation window and loss function. Ratio for the best performing model on each row in bold. The left panel reports the results based on alternative RQ interaction terms. The right panel reports the results from including $RQ^{1/2}$ as an explanatory variable.
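
    The normalized losses reported in the tables can be computed along the following lines. This is a hedged sketch: the QLIKE form used here, RV/F - log(RV/F) - 1, is the common definition from the forecasting literature and is an assumption about the loss used in this paper; the function names are hypothetical.

```python
import math


def mse(rv, forecast):
    """Mean squared error between realized variance and its forecast."""
    return sum((r - f) ** 2 for r, f in zip(rv, forecast)) / len(rv)


def qlike(rv, forecast):
    """Mean QLIKE loss, assuming the form RV/F - log(RV/F) - 1.

    Requires strictly positive realized variances and forecasts.
    """
    return sum(r / f - math.log(r / f) - 1.0 for r, f in zip(rv, forecast)) / len(rv)


def loss_ratio(loss_fn, rv, forecast, benchmark_forecast):
    """Loss of a model divided by the loss of the benchmark model.

    Values below 1 mean the model beats the benchmark.
    """
    return loss_fn(rv, forecast) / loss_fn(rv, benchmark_forecast)


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    rv = [1.0, 2.0, 4.0]
    benchmark = [2.0, 2.0, 2.0]
    model = [1.0, 2.0, 3.0]
    print(loss_ratio(mse, rv, model, benchmark))
    print(loss_ratio(qlike, rv, model, benchmark))
```

    By construction the benchmark column equals 1.0000 in every row, and any entry below one indicates a lower loss than the benchmark.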


    The two rightmost columns of Table 8 reveal that including $RQ^{1/2}$ as an explanatory variable in the HAR and HARQ models does not lead to improved forecasts. Similarly, applying alternative RQ transformations does not appear to be helpful. Overall, we conclude that the HARQ model demonstrates greater stability and is generally favored over the alternative specifications.

    HARQ is essentially an expansion of the HAR model. In a similar vein, the other benchmark volatility models can be extended accordingly. Following Bollerslev et al. (2016), from the HAR-J model defined in (3.2), we construct the HARQ-J model:

    $$RV_t = \beta_0 + (\beta_1 + \beta_{1Q} RQ_{t-1}^{1/2}) RV_{t-1} + \beta_2 RV_{t-1|t-5} + \beta_3 RV_{t-1|t-22} + \beta_J J_{t-1} + u_t. \tag{16}$$

    Furthermore, from the CHAR model defined in (3.2), we construct the CHARQ model:

    $$RV_t = \beta_0 + (\beta_1 + \beta_{1Q} TPQ_{t-1}^{1/2}) BPV_{t-1} + \beta_2 BPV_{t-1|t-5} + \beta_3 BPV_{t-1|t-22} + u_t. \tag{17}$$

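    Lastly, from the SHAR model defined in (3.2), we construct the SHARQ model:

    $$RV_t = \beta_0 + (\beta_1^{+} + \beta_{1Q}^{+} RQ_{t-1}^{1/2}) RV_{t-1}^{+} + (\beta_1^{-} + \beta_{1Q}^{-} RQ_{t-1}^{1/2}) RV_{t-1}^{-} + \beta_2 RV_{t-1|t-5} + \beta_3 RV_{t-1|t-22} + u_t. \tag{18}$$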
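
    To make the structure of these Q-extensions concrete, the sketch below builds the regressor matrix for the HARQ-J model in (16) and estimates it by ordinary least squares. It is an illustration under stated assumptions, not the authors' implementation: the weekly and monthly terms are taken to be averages over the previous 5 and 22 days, OLS is assumed as the estimator, and the names `harqj_design` and `fit_ols` are hypothetical.

```python
import numpy as np

WEEK, MONTH = 5, 22  # assumed aggregation windows for the weekly and monthly terms


def harqj_design(rv, rq, jump):
    """Build (X, y) for the HARQ-J regression.

    Columns of X: [1, RV_{t-1}, RQ_{t-1}^{1/2} * RV_{t-1},
                   weekly RV, monthly RV, J_{t-1}],
    matching the coefficients (beta_0, beta_1, beta_1Q, beta_2, beta_3, beta_J).
    """
    rv, rq, jump = (np.asarray(a, dtype=float) for a in (rv, rq, jump))
    rows = []
    for t in range(MONTH, len(rv)):
        rv_day = rv[t - 1]
        rows.append([
            1.0,
            rv_day,
            np.sqrt(rq[t - 1]) * rv_day,   # Q-adjustment interaction term
            rv[t - WEEK:t].mean(),
            rv[t - MONTH:t].mean(),
            jump[t - 1],
        ])
    return np.array(rows), rv[MONTH:]


def fit_ols(X, y):
    """Least-squares coefficient estimates for y = X @ beta + u."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta
```

    The CHARQ and SHARQ models in (17) and (18) follow the same pattern: only the columns change (bipower variation with tri-power quarticity, or the positive and negative semivariances each with their own interaction term).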

    Table 9 compares out-of-sample forecast results from each of the alternative Q-models (HARQ-J, CHARQ, and SHARQ) to their non-Q-adjusted baseline specifications. We also include the HARQ model. For both currencies, the improvements of the HARQ-J and CHARQ models over their baselines are in line with those of the basic HARQ model. This is in contrast to the SHARQ model, which is outperformed by SHAR in five of the eight cases, including all rolling-window cases. Bollerslev et al. (2016) report similar results.

    Table 9.  Out-of-sample forecast losses for alternative Q-models.
    EURUSD HARQ HARQ-J CHARQ SHARQ
    MSE-RW 0.9759 0.9693 0.9749 1.0613
    MSE-IW 0.9742 0.9563 0.9567 1.0315
    QLIKE-RW 0.9767 0.9845 0.9750 1.1473
    QLIKE-IW 0.9952 0.9960 0.9893 0.9987
    USDJPY HARQ HARQ-J CHARQ SHARQ
    MSE-RW 0.8885 0.8916 0.8914 1.0953
    MSE-IW 0.9446 0.9322 0.9389 0.8965
    QLIKE-RW 0.8824 0.8471 0.9040 1.3887
    QLIKE-IW 0.8949 0.8942 0.9178 0.8974
    Note: Model performance, expressed as model loss normalized by the loss of the relevant baseline models without the Q-adjustment terms. Each row reflects a combination of estimation window and loss function. Ratio for the best performing model on each row in bold.


    Recent history contains two independent events that have each induced turbulence in the global macroeconomy and financial markets. One is the outbreak of COVID-19 in March 2020; the other is the Russian invasion of Ukraine in February 2022, as illustrated in Figure 1.

    Figure 1.  EURUSD realized variance.

    To analyze this period of extreme market conditions in isolation, we perform a sub-sample analysis covering 2020–2022. Table 10 contains out-of-sample results for day-ahead volatility forecasts. Reassuringly, the overall results remain intact: the HARQ-F model has the lowest loss in seven of the eight cases when this extreme period is considered in isolation.

    Table 10.  Day ahead out-of-sample forecast losses, 2020–2022 subsample.
    EURUSD AR HAR HAR-J CHAR SHAR ARQ HARQ HARQ-F
    MSE-RW 1.2522 1.0000 0.9781 0.9745 1.0041 1.0425 0.9517 0.9304
    MSE-IW 1.2068 1.0000 0.9813 0.9764 0.9979 1.0976 0.9806 0.9677
    QLIKE-RW 1.3216 1.0000 1.0169 0.9829 1.0093 1.1370 0.9446 0.9065
    QLIKE-IW 1.5585 1.0000 1.0085 1.0119 1.0059 1.2338 0.9725 0.9701
    USDJPY AR HAR HAR-J CHAR SHAR ARQ HARQ HARQ-F
    MSE-RW 1.0930 1.0000 1.0555 0.9909 0.9822 0.9895 0.9564 0.9348
    MSE-IW 1.1099 1.0000 0.9958 0.9850 1.0112 1.0523 1.0071 0.9827
    QLIKE-RW 1.3404 1.0000 1.2635 1.0136 0.9845 0.9509 0.8611 0.8677
    QLIKE-IW 1.4766 1.0000 0.9939 0.9808 1.0108 1.0231 0.8453 0.7868
    Note: Model performance, expressed as model loss normalized by the loss of the HAR model. Each row reflects a combination of estimation window and loss function. Ratio for the best performing model on each row in bold.


    This study uses updated tick-level data from two major currency pairs, EURUSD and USDJPY, covering January 2010 to December 2022, to investigate the relevance of realized quarticity for out-of-sample volatility forecasts. We find that realized quarticity effectively captures noise caused by measurement errors, as evidenced by increased precision in daily, weekly, and monthly volatility estimates from models augmented with realized quarticity as an additional explanatory variable. These results are robust across estimation windows, evaluation metrics, and model specifications. As such, the results conform to comparable studies from other markets, predominantly on equity indices and single stocks. This paper also complements the relatively scarce body of literature on foreign exchange markets in this context.

    A myriad of volatility models based on the HAR framework have been proposed. Still, simple linear HAR specifications have proven remarkably difficult to beat, as shown by Audrino et al. (2024) and Branco et al. (2024). In a recent survey, Gunnarsson et al. (2024) report promising results for machine learning models and volatility forecasting across asset classes. The FX implied volatility surface contains a rich set of relevant predictive information across forecasting horizons and quantiles (de Lange et al., 2022). Thus, combining implied volatilities and high-frequency data using machine learning models, along the lines of Blom et al. (2023), appears as an interesting avenue for future research.

    Rarely does a single model dominate all others in terms of both statistical and economic criteria. It should therefore be of interest to investigate ensemble models in which high-frequency models are combined with other volatility model classes, such as time series models and stochastic volatility models, possibly including jump processes. The recently developed rough-path volatility models based on fractional Brownian motion (Salmon and SenGupta, 2021; Bayer et al., 2023) appear particularly relevant in this context.

    The authors declare they have not used Artificial Intelligence (AI) tools in the creation of this article.

    M.R.: Conceptualization, Methodology, Software, Formal analysis, Writing - Original Draft, Writing - Review & Editing.

    M.H.: Data Curation, Writing - Original Draft, Writing - Review & Editing.

    We would like to thank Andrew Patton for making the Matlab code from Bollerslev et al. (2016) available at https://public.econ.duke.edu/ap172/. Furthermore, we are grateful for insightful comments from the Editor and two anonymous reviewers, which helped us improve the paper.

    The authors declare no conflicts of interest.



  • © 2021 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)