Citation: Takeshi Miyama, Sung-mok Jung, Katsuma Hayashi, Asami Anzai, Ryo Kinoshita, Tetsuro Kobayashi, Natalie M. Linton, Ayako Suzuki, Yichi Yang, Baoyin Yuan, Taishi Kayano, Andrei R. Akhmetzhanov, Hiroshi Nishiura. Phenomenological and mechanistic models for predicting early transmission data of COVID-19[J]. Mathematical Biosciences and Engineering, 2022, 19(2): 2043-2055. doi: 10.3934/mbe.2022096
Abstract
Forecasting future epidemics helps inform policy decisions regarding interventions. During the early coronavirus disease 2019 epidemic period in January–February 2020, limited information was available, and it was too challenging to build detailed mechanistic models reflecting population behavior. This study compared the performance of phenomenological and mechanistic models for forecasting epidemics. For the former, we employed the Richards model and the approximate solution of the susceptible–infected–recovered (SIR) model. For the latter, we examined the exponential growth (with lockdown) model and the SIR model with lockdown. The phenomenological models yielded higher root mean square error (RMSE) values than the mechanistic models. When the reported data up to February 1 and 5 were used, the Richards model had the highest RMSE, whereas when the data up to February 9 were used, the SIR approximation model had the highest RMSE. The exponential model with a lockdown effect had the lowest RMSE, except when the February 9 data were used. Once interventions or other factors that influence transmission patterns are identified, they should additionally be taken into account to improve forecasting.
1. Introduction
In December 2019, clusters of atypical pneumonia cases caused by a novel coronavirus (SARS-CoV-2) emerged in Wuhan, China [1]. A rapid surge of coronavirus disease 2019 (COVID-19) cases was identified initially in Hubei Province, and by January 23, 2020, it involved a total of 25 provinces in China, with 571 cases and 17 deaths [2]. The Chinese government responded by implementing intensive control measures now referred to as a "lockdown." It started in Wuhan on January 23, 2020, and extended to the rest of Hubei Province in the following days, shutting down domestic and international flights, as well as trains, buses, subways, and ferries across the affected areas [3,4,5]. Newly reported cases in China decreased greatly soon after the lockdown's implementation, and the lockdown was lifted on April 8, 2020 [6]. By then, the number of cumulative cases and deaths had reached 83,161 and 3,342, respectively [7].
Real-time forecasts of future incidence can provide valuable insights into the scale and control of an epidemic, and can help assess the effects of possible interventions [8,9]. Mechanistic [10,11] and phenomenological [12,13,14,15,16] mathematical models have been used for forecasting epidemics. Phenomenological models can efficiently capture the epidemic trajectory by simply fitting the model to the incidence data as a function of time [17,18,19]. A frequently used forecasting methodology is the Richards model [20], which expresses a flexible S-shaped curve with a single inflection point, i.e., the epidemic peak. Forecasting with this model tends to be more certain when the data contain the epidemic peak [18]. Mechanistic models such as the susceptible–infected–recovered (SIR) compartment model [21] can capture mechanisms of transmission dynamics, incorporating heterogeneities such as age dependence and spatial variations, and can be useful for evaluating the effectiveness of interventions. During the early COVID-19 epidemic period, however, limited information was available, so building detailed mechanistic models that reflected population behavior was too challenging. At the same time, forecasting without an intervention effect could yield results that differ greatly from observations [11].
An important question is whether mechanistic models should proactively be employed for short-term forecasting during the very early stage of an epidemic, even if the mechanisms considered in the models are limited or simple (i.e., a homogeneously mixing population is assumed and the mechanistic details of public health countermeasures remain unknown). For the COVID-19 epidemic, the lockdown policy was a severe and intense countermeasure. In line with this, the present study aimed to compare the forecasting performance of phenomenological models and mechanistic models, the latter of which can incorporate the lockdown effect.
2. Materials and methods
2.1. Epidemiological data
We retrieved daily incidence data of confirmed COVID-19 cases in China by reporting date from January 4 (the first case reported through surveillance) through February 18 from the World Health Organization (WHO) website [7]. This time frame encompassed the start of the epidemic to roughly 2 weeks after the epidemic's peak.
From February 13, the definitions in the reporting criteria in China were revised, which was followed by an abrupt increase in the number of cases for February 13 and 14. These numbers were considered outliers, and we therefore excluded these 2 days of data from the analysis. In total, we analyzed incidence data of 53,330 confirmed cases.
2.2. Models
We used four different models for forecasting in this study. The first two were phenomenological: the Richards model [20] and the approximate solution of the basic SIR differential equations (SIR approximation model) [22]. The latter two were mechanistic: the exponential growth (exponential with lockdown) model and the SIR (SIR with lockdown) model, which incorporated the lockdown effect by changing the growth rate or transmission parameter before and after the lockdown. In these models, we assumed that the lockdown started on January 28, 2020, the date on which all prefecture-level and county-level cities and an autonomous prefecture in Hubei Province, except for the Shennongjia Forestry District, were subject to it [3].
2.2.1. Phenomenological models
2.2.1.1. Richards model
First, we used the Richards model [18,20] to fit the observed data and predict the epidemic. This model is known for real-time prediction of outbreaks and real-time detection of turning points. Its basic premise is that the incidence curve consists of a single peak of high incidence, resulting in an S-shaped cumulative epidemic curve and a single turning point of the outbreak [18]. The cumulative incidence was expressed as:
$$C_{\mathrm{cum}}(t) = \frac{K}{\left[1 + e^{-r(t - t_m)}\right]^{1/a}}, \tag{1}$$
where $C_{\mathrm{cum}}(t)$ is the cumulative number of infected cases at time $t$ in days; $K$ is the carrying capacity or total number of cases in the outbreak; $r$ is the per capita growth rate of the infected population; and $a$ is the exponent of deviation from the standard logistic curve. $t_i$ is the inflection point of the S-shaped epidemic curve obtained from this model, and $t_m = t_i + (\ln a)/r$, which equals $t_i$ when $a = 1$. From this model, the basic reproduction number $R_0$, or the average number of infections one infectious individual causes in an entirely susceptible population, can be computed as $R_0 = \exp(r/\gamma)$ for a constant generation time [18], where $1/\gamma$ is the mean generation time. Throughout this study, the mean generation time was assumed to be identical to the mean serial interval, at 4.8 days [23].
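As a minimal sketch of Eq (1) and the accompanying $R_0$ relation, the following Python snippet evaluates the Richards curve and converts a growth rate into $R_0$ under the constant generation time assumption. The function names and the parameter values (`K`, `r`, `a`, `t_m`) are illustrative assumptions, not estimates from this study; only the 4.8-day mean generation time comes from the text.

```python
import math

def richards_cumulative(t, K, r, a, t_m):
    """Cumulative cases, Eq (1): K / [1 + exp(-r (t - t_m))]^(1/a)."""
    return K / (1.0 + math.exp(-r * (t - t_m))) ** (1.0 / a)

def richards_daily(t, K, r, a, t_m):
    """Daily incidence taken as the one-day difference of the cumulative curve."""
    return richards_cumulative(t, K, r, a, t_m) - richards_cumulative(t - 1, K, r, a, t_m)

def r0_constant_generation_time(r, mean_generation_time=4.8):
    """R0 = exp(r / gamma), where 1/gamma is the mean generation time in days."""
    return math.exp(r * mean_generation_time)

# Hypothetical parameter values, chosen only for illustration.
K, r, a, t_m = 50000.0, 0.25, 1.0, 30.0
curve = [richards_cumulative(t, K, r, a, t_m) for t in range(61)]
r0 = r0_constant_generation_time(r)
```

With $a = 1$ the curve reduces to the standard logistic curve, so the cumulative count at $t = t_m$ is exactly $K/2$, which gives a quick sanity check on any implementation.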
2.2.1.2. Approximate solution of the basic SIR differential equations
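Second, the basic differential equations of the SIR model ($dS/dt = -\beta S I$, $dI/dt = \beta S I - \gamma I$, $dR/dt = \gamma I$) have an approximate solution for the epidemic curve, $C_{\mathrm{new}}(t)$, the number of new cases reported each day. $C_{\mathrm{new}}(t)$ is expressed as follows [24]:
$$C_{\mathrm{new}}(t) = \frac{\gamma \alpha^2 \rho^2}{2 S_0}\,\operatorname{sech}^2\!\left(\frac{1}{2}\alpha\gamma t - \varphi\right), \tag{2}$$
$$\alpha = \left(\left(\frac{S_0}{\rho} - 1\right)^2 + \frac{2 S_0 I_0}{\rho^2}\right)^{1/2}, \tag{3}$$
$$\varphi = \tanh^{-1}\!\left[\frac{1}{\alpha}\left(\frac{S_0}{\rho} - 1\right)\right], \tag{4}$$
which is generally a symmetrical, bell-shaped curve, where $\beta$ and $\gamma$ are the transmission parameter and recovery rate, respectively, and $S_0$ and $I_0$ are the initial numbers of susceptible and infected individuals, respectively. $\rho = \gamma/\beta$ is the relative recovery rate; thus, the basic reproduction number can be calculated as $R_0 = S_0/\rho$.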
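The bell-shaped curve of Eqs (2)–(4) can be sketched in a few lines of Python. The helper names and all numerical values below (`S0`, `I0`, `rho`) are hypothetical and chosen only so that $R_0 = S_0/\rho = 2.5$; they are not the estimates reported in this study.

```python
import math

def alpha_phi(S0, I0, beta, gamma):
    """Return (alpha, phi) from Eqs (3) and (4), with rho = gamma / beta."""
    rho = gamma / beta
    alpha = math.sqrt((S0 / rho - 1.0) ** 2 + 2.0 * S0 * I0 / rho ** 2)
    phi = math.atanh((S0 / rho - 1.0) / alpha)
    return alpha, phi

def sir_approx_daily(t, S0, I0, beta, gamma):
    """Daily new cases from Eq (2): a sech^2 curve in time."""
    rho = gamma / beta
    alpha, phi = alpha_phi(S0, I0, beta, gamma)
    sech = 1.0 / math.cosh(0.5 * alpha * gamma * t - phi)
    return gamma * alpha ** 2 * rho ** 2 / (2.0 * S0) * sech ** 2

# Hypothetical values for illustration only.
S0, I0 = 1.0e5, 10.0
gamma = 1.0 / 4.8          # recovery rate per day
rho = 4.0e4                # relative recovery rate, gamma / beta
beta = gamma / rho
r0 = S0 / rho              # basic reproduction number, S0 / rho

alpha, phi = alpha_phi(S0, I0, beta, gamma)
t_peak = 2.0 * phi / (alpha * gamma)   # time at which the sech^2 argument is zero
peak_cases = sir_approx_daily(t_peak, S0, I0, beta, gamma)
```

Because the curve depends on time only through $\operatorname{sech}^2$, it is symmetric around `t_peak`, which is the property that limits how well this model can follow an asymmetric epidemic curve.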
2.2.2. Mechanistic models with a lockdown effect
2.2.2.1. Exponential with lockdown model
We incorporated the effect of the lockdown intervention into the mechanistic models. First, we used an exponential model in which the growth of newly infected people differed before and after the lockdown. We modeled the number of newly infected people per day as:
where $i_0$ is the initial number of infected people, and $r_1$ and $r_2$ are the growth rates before and after implementation of the lockdown at time $t_{\mathrm{lockdown}}$. $R_0$ can be calculated using the formula $R_0 = (1 + r_1 v^2/\gamma)^{1/v^2}$ [10,25], where $v$ is the coefficient of variation of the generation time [26], for which we adopted 0.5.
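For illustration, one plausible piecewise form of such a model is sketched below in Python: incidence grows at rate $r_1$ until $t_{\mathrm{lockdown}}$ and at rate $r_2$ afterwards, with the two pieces joined continuously. This exact form, the function names, and the example values of `i0`, `r1`, `r2`, and `t_lockdown` are assumptions made for the sketch; only the $R_0$ formula, $v = 0.5$, and the 4.8-day mean generation time come from the text.

```python
import math

def exp_lockdown_incidence(t, i0, r1, r2, t_lockdown):
    """Assumed piecewise exponential incidence, continuous at t_lockdown."""
    if t < t_lockdown:
        return i0 * math.exp(r1 * t)
    return i0 * math.exp(r1 * t_lockdown + r2 * (t - t_lockdown))

def r0_from_growth_rate(r1, mean_generation_time=4.8, v=0.5):
    """R0 = (1 + r1 v^2 / gamma)^(1 / v^2), where 1/gamma is the mean generation time."""
    return (1.0 + r1 * v ** 2 * mean_generation_time) ** (1.0 / v ** 2)

# Hypothetical values for illustration only.
i0, r1, r2, t_lockdown = 1.0, 0.2, -0.05, 24
incidence = [exp_lockdown_incidence(t, i0, r1, r2, t_lockdown) for t in range(46)]
r0 = r0_from_growth_rate(r1)
```

A negative `r2` makes incidence decline after the lockdown, while $R_0$ depends only on the pre-lockdown growth rate `r1`.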
2.2.2.2. SIR with lockdown model
We then applied the discrete-time SIR model for newly infected cases per day. The following equations were used for the model:
$$S_{t+1} = S_t - S_t\left(1 - \exp\!\left(-\frac{\beta I_t}{N}\right)\right), \tag{6}$$
$$I_{t+1} = I_t + i_{\mathrm{new}}(t+1) - I_t\left(1 - \exp(-\gamma)\right), \tag{7}$$
$$R_{t+1} = R_t + I_t\left(1 - \exp(-\gamma)\right), \tag{8}$$
where $S_t$, $I_t$, and $R_t$ are, respectively, the numbers of individuals in the susceptible, infected, and recovered compartments on day $t$, and $\gamma$ is the recovery rate. $N$ is the population size of China (1.4 billion) and is equal to the total population of the S, I, and R compartments. We expressed the newly infected cases, $i_{\mathrm{new}}$, at time $t+1$ as the following equation:
where $\beta_1$ and $\beta_2$ are the transmission parameters before and after the start of the lockdown. Not all infected individuals were reported; the ascertainment rate, $p$, was taken to be about 10% [27], so the carrying capacity of reported cases would be approximately 10% of all infected individuals. Considering the ascertainment rate, we treated $i_{\mathrm{new}} \times p$ individuals as the observed data. In the SIR model, $R_0$ was calculated using the formula $R_0 = \beta_1/\gamma$.
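The recursion in Eqs (6)–(8) can be sketched as a short simulation. The snippet assumes that the new infections on day $t+1$ equal the number leaving the susceptible compartment in Eq (6), with $\beta_1$ applied before the lockdown and $\beta_2$ afterwards; this is inferred from conservation of the total population rather than quoted from the text. The function name, the initial condition `I0`, and the parameter values are hypothetical.

```python
import math

def simulate_sir_lockdown(beta1, beta2, gamma, t_lockdown, n_days,
                          N=1.4e9, I0=1.0, p=0.1):
    """Discrete-time SIR with a change in transmission at t_lockdown.

    Returns the expected reported infections per day (p * new infections)
    and the final (S, I, R) state.
    """
    S, I, R = N - I0, I0, 0.0
    reported = []
    for t in range(n_days):
        beta = beta1 if t < t_lockdown else beta2
        new_inf = S * (1.0 - math.exp(-beta * I / N))   # outflow from S, Eq (6)
        recovered = I * (1.0 - math.exp(-gamma))        # outflow from I, Eqs (7)-(8)
        S, I, R = S - new_inf, I + new_inf - recovered, R + recovered
        reported.append(p * new_inf)
    return reported, (S, I, R)

# Hypothetical values for illustration only.
gamma = 1.0 / 4.8
beta1, beta2 = 0.6, 0.1
reported, (S_end, I_end, R_end) = simulate_sir_lockdown(beta1, beta2, gamma, 24, 45)
r0 = beta1 / gamma
```

Scaling new infections by `p` mirrors the treatment of the ascertainment rate above: the model tracks all infections, while only a fraction is compared with reported cases.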
Note that for the estimation of R0, we did not differentiate the impact of various interventions, i.e., the different effectiveness of the lockdown policy and other countermeasures such as contact tracing, isolation, and mask wearing. Because the intrinsic R0 without any impact of interventions cannot be estimated, our estimate reflects the underlying interventions.
2.2.2.3. Convolution for the mechanistic models
Because the observed data were the numbers of reported cases in China, the probability density function of the time delay from infection to reporting was convolved with $i_{\mathrm{new}}(t)$ to obtain the expected number of reported cases, $C_{\mathrm{new}}(t)$, which was modeled as:
$$C_{\mathrm{new}}(t) = \int_0^t i_{\mathrm{new}}(t - u)\, f(u)\, du, \tag{10}$$
$$f(t) = (g * h)(t), \tag{11}$$
where $g(t)$ and $h(t)$ are the probability density functions of the incubation period (mean: 5.6 days, standard deviation: 3.9 days [28]) and the onset-report delay (mean: 4.9 days, standard deviation: 3.3 days [29]), respectively.
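A discrete-time version of Eqs (10) and (11) can be sketched as follows. The text gives only the means and standard deviations of the two delays, so the gamma distributional form used here is an assumption, as are the crude daily discretization (density evaluated at day midpoints and renormalized), the 40-day truncation, and the function names.

```python
import math

def gamma_pmf(mean, sd, max_days):
    """Daily probabilities from a gamma density (assumed form), renormalized."""
    shape = (mean / sd) ** 2
    scale = sd ** 2 / mean

    def pdf(x):
        log_pdf = ((shape - 1.0) * math.log(x) - x / scale
                   - math.lgamma(shape) - shape * math.log(scale))
        return math.exp(log_pdf)

    weights = [pdf(d + 0.5) for d in range(max_days)]
    total = sum(weights)
    return [w / total for w in weights]

def convolve(a, b, length):
    """Discrete convolution of two sequences, truncated to `length` terms."""
    out = [0.0] * length
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j < length:
                out[i + j] += ai * bj
    return out

g = gamma_pmf(5.6, 3.9, 40)          # incubation period
h = gamma_pmf(4.9, 3.3, 40)          # onset-to-report delay
f = convolve(g, h, len(g) + len(h) - 1)   # infection-to-report delay, Eq (11)

# Hypothetical input: 100 infections on day 0 and none afterwards.
i_new = [100.0] + [0.0] * 99
c_new = convolve(i_new, f, len(i_new))    # expected reports per day, Eq (10)
```

Feeding in a single pulse of infections returns the delay distribution itself scaled by the pulse size, which makes it easy to see how the reported curve lags and smooths the infection curve.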
2.3. Calibration
Using the above formulae, we fitted $C_{\mathrm{cum}}(t)$ of the Richards model and $C_{\mathrm{new}}(t)$ of the SIR approximation model and the two mechanistic models (i.e., exponential and SIR with lockdown models) to the observed reported data using maximum likelihood estimation with Poisson errors. We performed a bootstrapping method [30] with 10,000 iterations to obtain 95% confidence intervals (CIs) for the estimated parameters; this produced epidemic curves with 95% CIs. We used R statistical software (R Foundation for Statistical Computing, Vienna, Austria) [31] to perform these analyses.
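The calibration idea can be illustrated with a deliberately small Python sketch: a Poisson negative log-likelihood, a grid-search fit of a plain exponential curve, and a parametric bootstrap for the growth rate. This is not the R code used in the study. The single-phase exponential curve, the grid search, the parametric form of the bootstrap, the 100 replicates (the study used 10,000 iterations), and all names and values are simplifying assumptions.

```python
import math
import random

def poisson_nll(y, mu):
    """Poisson negative log-likelihood, dropping the constant log(y!) term."""
    return sum(m - yi * math.log(m) for yi, m in zip(y, mu))

def fit_exponential(y, r_grid):
    """Grid search over r; for each r the best i0 has a closed form."""
    best = None
    total = sum(y)
    for r in r_grid:
        growth = [math.exp(r * t) for t in range(len(y))]
        i0 = total / sum(growth)
        nll = poisson_nll(y, [i0 * g for g in growth])
        if best is None or nll < best[0]:
            best = (nll, i0, r)
    return best[1], best[2]

def poisson_draw(rng, lam):
    """Knuth's method; adequate for the small means used in this sketch."""
    limit, k, prod = math.exp(-lam), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def bootstrap_ci(y, r_grid, n_boot=100, seed=0):
    """Parametric bootstrap: resample counts from the fitted curve and refit."""
    rng = random.Random(seed)
    i0_hat, r_hat = fit_exponential(y, r_grid)
    mu = [i0_hat * math.exp(r_hat * t) for t in range(len(y))]
    draws = sorted(
        fit_exponential([poisson_draw(rng, m) for m in mu], r_grid)[1]
        for _ in range(n_boot)
    )
    return r_hat, draws[int(0.025 * n_boot)], draws[int(0.975 * n_boot) - 1]

# Hypothetical noise-free "data" generated from i0 = 2 and r = 0.15.
expected = [2.0 * math.exp(0.15 * t) for t in range(20)]
r_grid = [k * 0.002 for k in range(201)]
i0_hat, r_hat = fit_exponential(expected, r_grid)
r_point, r_lo, r_hi = bootstrap_ci(expected, r_grid)
```

In practice a numerical optimizer would replace the grid search, and the same resample-and-refit loop would be applied to each of the four models to obtain interval estimates for parameters and epidemic curves.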
2.4. Prediction assessment
We generated a 7-day forecast from each model calibrated at three different data cutoff points to evaluate the forecasting capability during the course of the epidemic. The cutoff points were February 1 (before the epidemic peak), February 5 (peak), and February 9 (after the peak). We compared root mean squared errors (RMSEs) and relative RMSEs of the forecasts among the models calibrated using each cutoff date. We defined RMSE as follows:
where $\hat{c}_t$ is the forecasted number of newly reported cases and $c_t$ is the observed number. Because the reporting definition in China changed from February 13, the reporting rate changed by an unknown factor, $1/\alpha$; we consequently performed a sensitivity analysis assuming $\alpha$ of 1, 0.9, 0.8, and 0.7. To calculate RMSE and relative RMSE, we used $c_t$ multiplied by $\alpha$ for the period with the changed reporting rate. We omitted the reported numbers of cases on February 13 and 14 from the analysis, with the assumption that they were outliers.
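A short Python sketch of these error measures is given below. The RMSE follows the standard definition (square root of the mean squared difference between forecast and observation). The relative RMSE shown, which scales each error by the observed value, is one common definition and is an assumption here, as are the function names and the toy numbers.

```python
import math

def rmse(forecast, observed):
    """Square root of the mean squared forecast error."""
    n = len(observed)
    return math.sqrt(sum((f - o) ** 2 for f, o in zip(forecast, observed)) / n)

def relative_rmse(forecast, observed):
    """Assumed definition: RMSE of errors scaled by the observed values."""
    n = len(observed)
    return math.sqrt(sum(((f - o) / o) ** 2 for f, o in zip(forecast, observed)) / n)

def adjust_observed(observed, changed_period, alpha):
    """Multiply observations by alpha on days with the changed reporting rate."""
    return [o * alpha if changed else o for o, changed in zip(observed, changed_period)]

# Toy numbers for illustration only.
forecast = [110.0, 90.0, 100.0]
observed = [100.0, 100.0, 100.0]
err = rmse(forecast, observed)
rel_err = relative_rmse(forecast, observed)
adjusted = adjust_observed([100.0, 100.0, 200.0], [False, False, True], 0.8)
```

Days treated as outliers (here, February 13 and 14) would simply be dropped from both lists before calling these functions.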
3. Results
Table 1 shows the R0 estimated from the four models above (Richards, SIR approximation, exponential with lockdown, and SIR with lockdown) using three different data periods (cutoff dates of February 1, 5, and 9). The R0 estimates of the phenomenological models (Richards and SIR approximation models) declined as the calibration cutoff moved from before to after the epidemic peak (R0: 2.6–7.7), whereas those of the mechanistic models (exponential with lockdown and SIR with lockdown) were relatively stable (R0: 2.4–3.3), irrespective of the data period used. Table S1 shows the estimates of all other parameters.
Table 1. Basic reproduction numbers, R0, estimated from the model calibrations using COVID-19 reported cases in China. Three different data periods (cutoff dates of February 1, 5, and 9) were used for the calibrations.
The 95% confidence interval derived from profile likelihood is given in parentheses. The estimated incidence represents infection, inclusive of mild and asymptomatic cases. SIR: susceptible–infected–recovered.
Figure 1 shows the model fit to the observed data and 7 days of forecasting. Table 2 shows the RMSEs and relative RMSEs calculated for the forecasting. When we used the reported data up to February 1 (before the epidemic curve peak) and February 5 (peak), the Richards model had the highest RMSE value (15362 and 1375, respectively), whereas when we used the data up to February 9 (after the peak), the SIR approximation model had the highest RMSE value. This trend was stable throughout the different reporting rates used for the sensitivity analysis (1529, 1402, 1259, and 1121 with α of 1, 0.9, 0.8, and 0.7, respectively). The exponential with lockdown model had the lowest RMSE, except when we used the data up to February 9 without the reporting rate adjustment, in which case the SIR with lockdown model had the lowest RMSE (627). The relative RMSE results showed almost the same trend as the RMSE results, except when the data up to February 9 were used with a reporting rate adjustment of α = 0.9. Throughout the data cutoff points, the mechanistic models tended to have lower RMSE and relative RMSE values than the phenomenological models, with the following exceptions: the RMSE of the phenomenological Richards model (652 and 500) was lower than that of the mechanistic SIR with lockdown model (882 and 1025) when α was 0.8 and 0.7, respectively, in the sensitivity analysis; and the relative RMSE of the Richards model (0.41) was lower than that of the SIR with lockdown model (0.49) when α was 0.7.
Figure 1. Estimated number of cases from calibration and 7 days forecasting from the Richards, susceptible–infected–recovered (SIR) approximation, exponential with lockdown, and SIR with lockdown models in China by date of reporting. Calibrations were conducted using three different data cutoff points: February 1 (red), 5 (green), and 9 (blue). Solid lines with shaded areas show medians and 95% confidence intervals for calibration, while dashed lines with light-shaded areas show medians and 95% prediction intervals for forecasting. Gray bars show the number of cases by reporting date, and those on February 13 and 14 were omitted for the forecasting period, with the assumption they were outliers.
Table 2. Root mean square errors (RMSEs) and relative RMSEs for the 7 days of forecasting from the models calibrated using three different periods (cutoff dates of February 1, 5, and 9) of reported COVID-19 cases in China.

| Metric | Cutoff date | Richards | SIR approximation | Exponential with lockdown | SIR with lockdown |
|---|---|---|---|---|---|
| RMSE | 1-Feb | 15362 | 1858 | 1585 | 1604 |
| RMSE | 5-Feb | 1375 | 1244 | 457 | 934 |
| RMSE | 9-Feb | 968 | 1549 | 823 | 627 |
| RMSE | α*=0.9 | 809 | 1402 | 665 | 747 |
| RMSE | α=0.8 | 652 | 1259 | 510 | 882 |
| RMSE | α=0.7 | 500 | 1121 | 364 | 1025 |
| Relative RMSE | 1-Feb | 1.49 | 0.89 | 0.71 | 0.72 |
| Relative RMSE | 5-Feb | 0.73 | 0.69 | 0.18 | 0.32 |
| Relative RMSE | 9-Feb | 0.67 | 1.59 | 0.52 | 0.26 |
| Relative RMSE | α=0.9 | 0.60 | 1.51 | 0.44 | 0.32 |
| Relative RMSE | α=0.8 | 0.51 | 1.43 | 0.35 | 0.40 |
| Relative RMSE | α=0.7 | 0.41 | 1.33 | 0.26 | 0.49 |
*For the sensitivity analysis, the observed numbers of reported cases from February 15 were multiplied by α in the RMSE and relative RMSE calculations because of the change in the definition of reported cases (i.e., the change in the reporting rate was assumed to be 1/α). The rows labeled with α use the February 9 cutoff. The reported case data for February 13 and 14 were omitted from the RMSE and relative RMSE calculations, with the assumption that they were outliers. SIR: susceptible–infected–recovered.
4. Discussion
In this study, simple mechanistic models that account for the lockdown effect forecasted the short-term future incidence of COVID-19 better than the phenomenological models. This finding was consistent irrespective of the epidemic phase (i.e., the data points used for the calibration). In a data-limited setting, such as the early phase of the epidemic in China, phenomenological models were useful because they required only the incidence data of the infectious disease and they captured the overall epidemic trajectory. Once interventions or other factors that may influence the transmission patterns (e.g., the lockdown) occurred, however, the epidemic dynamics changed considerably and it became essential to account for the change in the forecasting models.
The Richards model forecasted the epidemic better than the SIR approximation model (the other phenomenological model) when data were available until after the epidemic peak. The Richards model, also known as the generalized logistic model, has been used to predict the spread of infectious diseases during previous epidemics, such as foot-and-mouth disease in the United Kingdom [33] and the Ebola outbreak in West Africa [15,19,34,35]. It was also applied to forecast the incidence of the ongoing COVID-19 epidemic (now pandemic) [12,13]. The Richards model is well known for its flexibility in describing epidemic curves: a deviation from the standard logistic curve can be captured [18], but fluctuations in the data available before the epidemic peak can dramatically alter the estimated final size. In this study, however, once the epidemic passed its peak, the Richards model forecast was comparable with that of the SIR with lockdown model.
The SIR approximation model forecasted the epidemic better than the Richards model when epidemic data used for the calibration preceded the epidemic peak. Although the SIR approximation model has less interpretive ability compared with the Richards model, and generally is simply a symmetrical bell-shaped curve, there is potential merit in having this modeling option to evaluate the forecasting especially in the epidemic's early phase (i.e., before the peak).
In the sensitivity analysis considering the reporting rate change (1/α) from February 13, the trend of forecasting ability was not consistent between the SIR with lockdown and Richards models (see Table 2). The SIR with lockdown model had a smaller RMSE than the Richards model when α was 0.9 and a smaller relative RMSE when α was 0.9–0.8, but the result was reversed for RMSE when α was 0.8–0.7 and for relative RMSE when α was 0.7, although the differences between the two models were small. This was because the SIR with lockdown model overestimated the number of reported cases while the Richards model underestimated it. Overestimation by the SIR with lockdown model could be due to the large number of individuals in the susceptible compartment, given the assumptions that S0 is equal to the total population of China and that mixing is homogeneous. The exponential with lockdown model did not need these assumptions (homogeneous or heterogeneous mixing, or the number of susceptible individuals in China's total population), and most effectively forecasted the future incidence. This model had both mechanistic and phenomenological aspects, as the intervention mechanism was incorporated into the exponential curve. Note that the data used for the analyses were the reported numbers of cases in China, and under-ascertainment may have influenced the data. We assumed a constant under-ascertainment rate, which in reality might be time dependent, although the time dependence may not have had a large impact because of the relatively short time horizon of our study period. When further epidemiological data (e.g., time-dependent seroprevalence) are available, validation of the observed data and forecasting should be addressed in further research.
When one model overestimates the number of cases and the other underestimates it, relative RMSE may be suitable for model validation. This is because the absolute difference between prediction and observation, which is used in the RMSE calculation, has a different meaning between the two models, even if they have the same value. In this study, this is evidenced by the RMSE of the SIR with lockdown model being larger than that of the Richards model when the data cutoff point was February 9 and α was 0.8, whereas the relative RMSE of the SIR with lockdown model was smaller than that of the Richards model under the same condition (i.e., data cutoff point: February 9; α: 0.8).
This study had several limitations. First, as mentioned above, the mechanistic models generally forecasted better than the phenomenological models. In an epidemic's early phase, however, when there is high uncertainty in the underlying transmission dynamics, forecasting future incidence with an SIR mechanistic model without any intervention effect may produce biased results [11]. The effect of China's lockdown assessed in this study was also clear and had substantial impacts. Note that if unknown effects that influence disease transmission are not incorporated into mechanistic models, the better forecasting observed in the mechanistic models may change (i.e., forecasting could be less accurate). Second, this study conducted short-term forecasting during the very early period of the pandemic. Other published studies explored the heterogeneity of populations and multiple factors that influence the epidemic dynamics [36,37], information that we assumed was not yet available at the time of forecasting. Results consistent with ours may not be obtained if the scope is extended to a longer forecasting period. Third, the R0 values estimated from our models might be influenced by interventions other than the lockdown measure in China. The R0 from the phenomenological models was obtained when an inflection point was properly identified. Because of the uncertainty of the incidence data and the extent of interventions implemented in the early phase of the epidemic, the estimate fluctuated substantially. Therefore, the estimated R0 values were comparable between the phenomenological and mechanistic models only when data after the epidemic peak were used; this was not the case when the dataset before the epidemic peak was used.
The R0 estimation from the phenomenological models was influenced by the above-mentioned issue (i.e., data uncertainty and the extent of interventions implemented in the early phase), while the R0 from the mechanistic models reflected the overall intervention effect before the lockdown measure (and was thus more stable regardless of the data cutoff points). We did not account for this time-varying intervention effect [38,39] in the mechanistic models, because our exercise assumed a shortage of information at the very beginning of an epidemic.
5. Conclusions
In conclusion, this study investigated the forecasting ability of phenomenological and simple mechanistic models. The mechanistic models considering a lockdown effect forecasted better than the phenomenological models. To effectively capture disease dynamics, integrated interpretations from both phenomenological and mechanistic models are required as factors such as the epidemic's phase, interventions implemented, and population behavior influence the results of forecasting.
H.N. received funding from the Health and Labour Sciences Research Grant (19HA1003, 20CA2024, 20HA2007, and 21HB1002); Japan Agency for Medical Research and Development (AMED; JP20fk0108140 and JP20fk0108535); the Japan Society for the Promotion of Science (JSPS) KAKENHI (17H04701 and 21H03198); Environment Research and Technology Development Fund (JPMEERF20S11804) of the Environmental Restoration and Conservation Agency of Japan; the Inamori Foundation; GAP Fund Program of Kyoto University; and the Japan Science and Technology Agency (JST) CREST program (JPMJCR1413) and the SICORP program (JPMJSC20U3 and JPMJSC2105). T.M. received JSPS KAKENHI (19K24219). S.-m.J. received JSPS KAKENHI (20J2135800). K.H. received JSPS KAKENHI (20K18953). R.K. received JSPS KAKENHI (21K17307). T.K. received JSPS KAKENHI (21K10467). N.M.L. received a graduate scholarship from the Japanese Ministry of Education, Culture, Sports, Science, and Technology (MEXT). A.S. received JSPS KAKENHI (19K24159). T.K. received JSPS KAKENHI (21K10495).
Conflict of interest
All authors declare no conflicts of interest in this paper.
The Paper, Xiangyang Railway Station is closed, the last prefecture-level city in Hubei Province is "closed", Shanghai Oriental Press, 2020. Available from: https://www.thepaper.cn/newsDetail_forward_5671283.
C. S. Lutz, M. P. Huynh, M. Schroeder, S. Anyatonwu, F. S. Dahlgren, G. Danyluk, et al., Applying infectious disease forecasting to public health: A path forward using influenza forecasting examples, BMC Public Health, 19 (2019), 1659. doi: 10.1186/s12889-019-7966-8. doi: 10.1186/s12889-019-7966-8
[9]
L. S. Fischer, S. Santibanez, R. J. Hatchett, D. B. Jernigan, L. A. Meyers, P. G. Thorpe, et al., CDC grand rounds: Modeling and public health decision-making, Morb. Mortal. Wkly. Rep., 65 (2016), 1374–1377. doi: 10.15585/mmwr.mm6548a4. doi: 10.15585/mmwr.mm6548a4
[10]
K. Hayashi, T. Kayano, S. Sorano, H. Nishiura, Hospital caseload demand in the presence of interventions during the COVID-19 pandemic: a modeling study, J. Clin. Med., 9 (2020), 3065. doi: 10.3390/jcm9103065. doi: 10.3390/jcm9103065
[11]
J. T. Wu, K. Leung, G. M. Leung, Nowcasting and forecasting the potential domestic and international spread of the 2019-nCoV outbreak originating in Wuhan, China: A modelling study, Lancet, 395 (2020), 689–697. doi: 10.1016/S0140-6736(20)30260-9. doi: 10.1016/S0140-6736(20)30260-9
[12]
K. Roosa, Y. Lee, R. Luo, A. Kirpich, R. Rothenberg, M. James, et al., Short-term forecasts of the COVID-19 epidemic in Guangdong and Zhejiang, China: February 13–23, 2020. J. Clin. Med., 9 (2020), 596. doi: 10.3390/jcm9020596. doi: 10.3390/jcm9020596
[13]
K. Roosa, Y. Lee, R. Luo, A. Kirpich, R. Rothenberg, J. M. Hyman, et al., Real-time forecasts of the COVID-19 epidemic in China from February 5th to February 24th, 2020. Infect. Dis. Model., 5 (2020), 256–263. doi: 10.1016/j.idm.2020.02.002. doi: 10.1016/j.idm.2020.02.002
[14]
G. Chowell, D. Hincapie-Palacio, J. Ospina, B. Pell, A. Tariq, S. Dahal, et al., Using phenomenological models to characterize transmissibility and forecast patterns and final burden of Zika epidemics, PLoS Curr., 8 (2016), ecurrents.outbreaks.f14b2217c902f453d9320a43a35b9583. doi: 10.1371/currents.outbreaks.f14b2217c902f453d9320a43a35b9583. doi: 10.1371/currents.outbreaks.f14b2217c902f453d9320a43a35b9583
[15]
B. Pell, Y. Kuang, C. Viboud, G. Chowell, Using phenomenological models for forecasting the 2015 Ebola challenge. Epidemics, 22 (2018), 62–70. doi: 10.1016/j.epidem.2016.11.002. doi: 10.1016/j.epidem.2016.11.002
[16]
N. Balak, D. Inan, M. Ganau, C. Zoia, S. Sönmez, B. Kurt, et al., A simple mathematical tool to forecast COVID-19 cumulative case numbers, Clin. Epidemiol. Glob. Health, 12 (2021), 100853. doi: 10.1016/j.cegh.2021.100853. doi: 10.1016/j.cegh.2021.100853
[17]
F. Y. Hsieh, D. A. Bloch, M. D. Larsen, A simple method of sample size calculation for linear and logistic regression, Stat. Med., 17 (1998), 1623–1634. doi: 10.1002/(sici)1097-0258(19980730)17:14<1623::aid-sim871>3.0.co; 2-s. doi: 10.1002/(sici)1097-0258(19980730)17:14<1623::aid-sim871>3.0.co;2-s
[18]
Y.-H. Hsieh, Richards model: A simple procedure for real-time prediction of outbreak severity, in Modeling and Dynamics of Infectious Diseases (eds. Z. Ma, Y. Zhou and J. Wu), World Scientific Pub. Co. Inc., (2009), 216–236. doi: 10.1142/9789814261265_0009.
[19]
K. Roosa, A. Tariq, P. Yan, J. M. Hyman, G. Chowell, Multi-model forecasts of the ongoing Ebola epidemic in the Democratic Republic of Congo, March-October 2019, J. R. Soc. Interface, 17 (2020), 20200447. doi: 10.1098/rsif.2020.0447.
[20]
F. J. Richards, A flexible growth function for empirical use, J. Exp. Bot., 10 (1959), 290–301. doi: 10.1093/jxb/10.2.290.
[21]
M. J. Keeling, P. Rohani, Introduction to simple epidemic models, in Modeling Infectious Diseases in Humans and Animals, Princeton University Press, (2008), 15–53. doi: 10.2307/j.ctvcm4gk0.
[22]
N. T. Bailey, General epidemics, in The Mathematical Theory of Infectious Diseases, 2nd ed, Hafner Press, (1975), 81–102.
[23]
H. Nishiura, N. M. Linton, A. R. Akhmetzhanov, Serial interval of novel coronavirus (COVID-19) infections, Int. J. Infect. Dis., 93 (2020), 284–286. doi: 10.1016/j.ijid.2020.02.060.
[24]
W. O. Kermack, A. G. McKendrick, A contribution to the mathematical theory of epidemics, Proc. R. Soc. A, 115 (1927), 700–721. doi: 10.1098/rspa.1927.0118.
[25]
S. Jung, A. R. Akhmetzhanov, K. Hayashi, N. M. Linton, Y. Yang, B. Yuan, et al., Real-time estimation of the risk of death from novel coronavirus (COVID-19) infection: Inference using exported cases, J. Clin. Med., 9 (2020), 523. doi: 10.3390/jcm9020523.
[26]
J. Wallinga, M. Lipsitch, How generation intervals shape the relationship between growth rates and reproductive numbers, Proc. R. Soc. B, 274 (2007), 599–604. doi: 10.1098/rspb.2006.3754.
[27]
H. Nishiura, T. Kobayashi, Y. Yang, K. Hayashi, T. Miyama, R. Kinoshita, et al., The rate of underascertainment of novel coronavirus (2019-nCoV) infection: Estimation using Japanese passengers data on evacuation flights, J. Clin. Med., 9 (2020), 419. doi: 10.3390/jcm9020419.
[28]
N. M. Linton, T. Kobayashi, Y. Yang, K. Hayashi, A. R. Akhmetzhanov, S. Jung, et al., Incubation period and other epidemiological characteristics of 2019 novel coronavirus infections with right truncation: A statistical analysis of publicly available case data, J. Clin. Med., 9 (2020), 538. doi: 10.3390/jcm9020538.
[29]
K. Leung, J. T. Wu, D. Liu, G. M. Leung, First-wave COVID-19 transmissibility and severity in China outside Hubei after control measures, and second-wave scenario planning: A modelling impact assessment, Lancet, 395 (2020), 1382–1393. doi: 10.1016/S0140-6736(20)30746-7.
[30]
G. Chowell, Fitting dynamic models to epidemic outbreaks with quantified uncertainty: A primer for parameter uncertainty, identifiability, and forecasts, Infect. Dis. Model., 2 (2017), 379–398. doi: 10.1016/j.idm.2017.08.001.
[31]
R Core Team, R: A language and environment for statistical computing, R Foundation for Statistical Computing, 2020. Available from: https://www.R-project.org/.
[32]
M. Li, J. Dushoff, B. M. Bolker, Fitting mechanistic epidemic models to data: A comparison of simple Markov chain Monte Carlo approaches, Stat. Methods Med. Res., 27 (2018), 1956–1967. doi: 10.1177/0962280217747054.
[33]
D. W. Shanafelt, G. Jones, M. Lima, C. Perrings, G. Chowell, Forecasting the 2001 Foot-and-Mouth Disease epidemic in the UK, Ecohealth, 15 (2018), 338–347. doi: 10.1007/s10393-017-1293-2.
[34]
W. Liu, S. Tang, Y. Xiao, Model selection and evaluation based on emerging infectious disease data sets including A/H1N1 and Ebola, Comput. Math. Methods Med., 2015 (2015), 207105. doi: 10.1155/2015/207105.
[35]
Y. H. Hsieh, Temporal course of 2014 Ebola virus disease (EVD) outbreak in West Africa elucidated through morbidity and mortality data: A tale of three countries, PLoS One, 10 (2015), 1–12. doi: 10.1371/journal.pone.0140810.
[36]
A. V. Tkachenko, S. Maslov, A. Elbanna, G. N. Wong, Z. J. Weiner, N. Goldenfeld, Time-dependent heterogeneity leads to transient suppression of the COVID-19 epidemic, not herd immunity, Proc. Natl. Acad. Sci. USA, 118 (2021), e2015972118. doi: 10.1073/PNAS.2015972118.
[37]
COVIDSurg Collaborative, GlobalSurg Collaborative, SARS-CoV-2 vaccination modelling for safe surgery to save lives: data from an international prospective cohort study, Br. J. Surg., 108 (2021), 1056–1063. doi: 10.1093/bjs/znab101.
[38]
B. F. Maier, D. Brockmann, Effective containment explains subexponential growth in recent confirmed COVID-19 cases in China, Science, 368 (2020), 742–746. doi: 10.1126/science.abb4557.
[39]
M. Djordjevic, M. Djordjevic, B. Ilic, S. Stojku, I. Salom, Understanding infection progression under strong control measures through universal COVID-19 growth signatures, Glob. Chall., 5 (2021), 2000101. doi: 10.1002/gch2.202000101.
Table 1.
Basic reproduction numbers, R0, estimated from the model calibrations using COVID-19 reported cases in China. Three different data periods (cutoff dates of February 1, 5, and 9) were used for the calibrations.
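Table 2.
Root mean square errors (RMSEs) and relative RMSEs for the 7 days of forecasting from the models calibrated using three different periods (cutoff dates of February 1, 5, and 9) of reported COVID-19 cases in China.

| Metric | Cutoff date | Richards | SIR approximation | Exponential with lockdown | SIR with lockdown |
|---|---|---|---|---|---|
| RMSE | 1-Feb | 15362 | 1858 | 1585 | 1604 |
| RMSE | 5-Feb | 1375 | 1244 | 457 | 934 |
| RMSE | 9-Feb | 968 | 1549 | 823 | 627 |
| RMSE | α* = 0.9 | 809 | 1402 | 665 | 747 |
| RMSE | α = 0.8 | 652 | 1259 | 510 | 882 |
| RMSE | α = 0.7 | 500 | 1121 | 364 | 1025 |
| Relative RMSE | 1-Feb | 1.49 | 0.89 | 0.71 | 0.72 |
| Relative RMSE | 5-Feb | 0.73 | 0.69 | 0.18 | 0.32 |
| Relative RMSE | 9-Feb | 0.67 | 1.59 | 0.52 | 0.26 |
| Relative RMSE | α = 0.9 | 0.60 | 1.51 | 0.44 | 0.32 |
| Relative RMSE | α = 0.8 | 0.51 | 1.43 | 0.35 | 0.40 |
| Relative RMSE | α = 0.7 | 0.41 | 1.33 | 0.26 | 0.49 |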
*For the sensitivity analysis, the observed numbers of reported cases from February 15 onward were multiplied by α in the RMSE and relative RMSE calculations to account for the change in the definition of reported cases (i.e., the reporting rate was assumed to have changed by a factor of 1/α). The reported case data for February 13 and 14 were omitted from the RMSE and relative RMSE calculations on the assumption that they were outliers. SIR: susceptible–infected–recovered.
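The adjustment described in the Table 2 footnote can be illustrated with a minimal Python sketch. It is not the authors' code (the study cites R for its analysis). The function names, the default dates, and the example numbers are hypothetical and chosen only for illustration. In particular, the normalization behind the relative RMSE is not stated in this section, so the sketch assumes one common definition, the root mean square of the relative errors.

```python
from datetime import date
from math import sqrt


def adjusted_pairs(dates, observed, predicted, alpha=1.0,
                   change_date=date(2020, 2, 15),
                   omitted=(date(2020, 2, 13), date(2020, 2, 14))):
    """Return (observed, predicted) pairs used for the error calculation.

    Days listed in `omitted` are dropped (treated as outliers), and observed
    counts on or after `change_date` are multiplied by `alpha`.
    """
    pairs = []
    for day, obs, pred in zip(dates, observed, predicted):
        if day in omitted:
            continue  # skip the days assumed to be outliers
        if day >= change_date:
            obs = obs * alpha  # reporting rate assumed to have changed by 1/alpha
        pairs.append((obs, pred))
    return pairs


def rmse(pairs):
    """Root mean square error over (observed, predicted) pairs."""
    return sqrt(sum((obs - pred) ** 2 for obs, pred in pairs) / len(pairs))


def relative_rmse(pairs):
    """ASSUMED definition: root mean square of (observed - predicted) / observed."""
    return sqrt(sum(((obs - pred) / obs) ** 2 for obs, pred in pairs) / len(pairs))


# Hypothetical counts for February 12-16, 2020 (not data from the study).
days = [date(2020, 2, d) for d in range(12, 17)]
obs_counts = [100, 900, 800, 200, 200]
pred_counts = [110, 0, 0, 150, 170]

pairs = adjusted_pairs(days, obs_counts, pred_counts, alpha=0.8)
print(rmse(pairs), relative_rmse(pairs))
```

Setting `alpha=1.0` leaves the observed counts unchanged, while values such as 0.9, 0.8, and 0.7 mirror the sensitivity rows of the table.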
Figure 1. Estimated number of cases from calibration and 7-day forecasting from the Richards, susceptible–infected–recovered (SIR) approximation, exponential with lockdown, and SIR with lockdown models in China by date of reporting. Calibrations were conducted using three different data cutoff points: February 1 (red), 5 (green), and 9 (blue). Solid lines with shaded areas show medians and 95% confidence intervals for calibration, while dashed lines with light-shaded areas show medians and 95% prediction intervals for forecasting. Gray bars show the number of cases by reporting date; those for February 13 and 14 were omitted from the forecasting period on the assumption that they were outliers.