Over the last decades, several bootstrap procedures have been proposed to infer properties of a statistic of interest in the time series context, including both parametric and non-parametric resampling schemes. However, the effectiveness of the different bootstrap procedures depends on their ability to capture the dependent probabilistic structure of the underlying stochastic process under study, and on the analytical properties of the particular statistic considered.
When it is reasonable to impose parametric-type assumptions on the process, the residual bootstrap is usually a sensible choice. In this general bootstrap scheme, the dependence structure of the series is modelled explicitly by using a parametric model, and the bootstrap sample is drawn from the fitted model. This approach can be easily implemented and leads to very efficient results, but it is very sensitive to model misspecification, in which case the resulting bootstrap estimators might not be consistent. In this latter case, or when it is possible to account only for some mixing or weak dependence structure, more complex and fully non-parametric bootstrap methods are required. In this context, sieve bootstrap schemes have become useful and powerful tools to capture the dependence structure of the data without imposing any rigid parametric model specification.
The basic idea of this approach is to use non-parametric estimators as sieve approximators. In particular, the stochastic process under study is approximated by a family of (semi-)parametric models which, in a proper sense, contains the original process. Once an appropriate model selection rule is fixed, a model is picked from the identified family and estimated on the observed dataset. Residual bootstrap is then implemented on the previously estimated model. In the context of sieve bootstrap schemes, the most widely used approach is the AR-Sieve bootstrap procedure [1,2,3]. It is based on the method of autoregressive sieves, in which an AR(p_T) model is fitted to the observed data and a bootstrap sample is generated by resampling from the centred residuals. This resampling scheme retains the simplicity of the classical residual bootstrap while being a nonparametric bootstrap scheme, and it enjoys the properties of a plug-in rule. Moreover, it does not exhibit artefacts in the dependence structure like the blockwise bootstrap, and there is no need for 'pre-vectorizing' the original observations. For these reasons, the AR-Sieve bootstrap has been largely used in the literature for constructing prediction intervals for linear processes [4,5,6,7,8,9], for unit root testing [10] and stationarity testing [11], and for fractionally integrated and non-invertible processes [12,13]. More recently, it has also been used in the context of functional time series [14] and spatial processes [15].
However, the AR-Sieve bootstrap performs better than other bootstrap techniques if the data generating process is linear and representable as an AR(∞) process. Moreover, for quite general processes, it is expected to deliver consistent results if the asymptotic distribution of a given statistic is determined solely by the first and second order moment structure [16] or if the original time series is transformed and a more complex residual bootstrap scheme is applied [17].
For general nonlinear processes, an approach based on the use of feedforward Neural Networks (NN) as sieve approximators has been proposed [18]. The NN-Sieve resampling scheme, which is non-parametric in its spirit, retains the conceptual simplicity of the AR-Sieve bootstrap and it delivers consistent results for quite general nonlinear processes. Moreover, it performs similarly to the AR-Sieve bootstrap for linear processes while it outperforms the AR-Sieve bootstrap and the moving block bootstrap for nonlinear processes, both in terms of bias and variability [19].
However, despite their proven theoretical capabilities of non-parametric, data-driven universal approximation of general nonlinear functions, NNs face challenging issues concerning the computational burden, which can be very heavy, especially for complex nonlinear generating processes. This is a serious concern, even with the computational power available nowadays, when computer-intensive model selection techniques (such as cross-validation) are used to tune the neural network hyper-parameters (hidden layer size, weight decay, etc.) within a resampling scheme involving neural networks. Moreover, in many applications of the bootstrap, the statistical model needs to be re-estimated on each bootstrap resample in order to incorporate the uncertainty due to model estimation; this makes the bootstrap infeasible when applied to complex estimation procedures. Finally, many statistical problems can be solved in practice only by using iterated bootstrap, which of course requires very efficient estimation procedures.
To overcome these problems, our proposal is to estimate the neural network model in the NN-Sieve procedure by using learning without iterative tuning. This approach, known as Extreme Learning Machines (ELM) in the computational intelligence and machine learning literature [20,21], has been extensively studied, and remarkable contributions have been made in both theory and applications. By using ELMs, a nonlinear autoregressive sieve bootstrap scheme can be implemented for general nonlinear time series. This scheme has the advantage of dramatically reducing the computational burden of the overall procedure; moreover, it achieves performance comparable to the NN-Sieve bootstrap with computing time comparable to the AR-Sieve bootstrap.
The paper is organized as follows. In Section 2, a brief review of neural networks for time series is introduced in the context of Nonlinear Autoregressive (NAR) time series. In Section 3, the extreme learning machine approach is presented and discussed, highlighting the advantages of its use with respect to the classical neural network approach; the NAR-Sieve bootstrap based on ELMs is then proposed, emphasizing the improvement with respect to alternative bootstrap schemes. In Section 4, a simulation experiment is performed in order to evaluate the performance of the proposed bootstrap procedure and to compare it with the NN approach. Some remarks close the paper.
Let \{Y_t, t \in \mathbb{Z}\} be a real-valued stochastic process modeled as a nonlinear autoregressive process with exogenous components:
Y_t = m(Y_{t-1}, \ldots, Y_{t-p}, \mathbf{X}_t) + \varepsilon_t \qquad (2.1)
where m(\cdot) is an unknown (possibly nonlinear) function, the \varepsilon_t are i.i.d. innovations with mean 0 and finite variance, and \mathbf{X}_t is a d-dimensional stochastic process representing other explanatory variables, useful in predicting Y_t or other functionals related to Y_t.
Denoting by I_t the information set available at time t and using the abbreviation \mathbf{Z}_t = (Y_{t-1}, \ldots, Y_{t-p}, \mathbf{X}_t), the conditional expectation of Y_t given the information set I_t is:
E(Y_t | I_t) = m(\mathbf{Z}_t) \qquad (2.2)
There are many different parametric approaches to modelling the function m, and they can give quite different answers in the range of perhaps most interest to practitioners. In many cases, practitioners are not willing to assume any parametric form (avoiding, in this way, model misspecification errors), and this motivates a nonparametric approach, because of the greater flexibility in functional form thereby allowed. For models with simple lag structure and without exogenous variables, the problem has been thoroughly investigated using local smoothers, such as in [22].
However, due to the so-called "curse of dimensionality", the use of local smoothers does not allow complex lag structures or the inclusion of other explanatory variables. These issues led Franke and Diagne [23] to propose an alternative approach based on feedforward neural networks with one hidden layer. The function m can be approximated by using neural networks with a single output and additive nodes in the class:
\mathcal{F} = \left\{ f(\mathbf{z}, \eta): \mathbf{z} \in \mathbb{R}^{p+d}, \eta \in \mathbb{R}^{r(p+d+2)} \right\} \qquad (2.3)
with:
f_r(\mathbf{z}, \eta) = \sum_{k=1}^{r} \beta_k \, \psi(\mathbf{a}_k' \mathbf{z} + b_k) \qquad (2.4)
where r is the hidden layer size; \psi(\cdot) is a sigmoidal activation function; \eta = (\beta_1, \ldots, \beta_r, \mathbf{a}_1', \mathbf{a}_2', \ldots, \mathbf{a}_r', b_1, \ldots, b_r)'; the \{\mathbf{a}_k\} are the (p+d)-dimensional vectors of weights for the connections between the input layer and the hidden layer; the \{\beta_k\} are the weights of the links between the hidden layer and the output; and the \{b_k\} are the bias terms of the hidden neurons.
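As an illustration of the network map in Eq. (2.4), the following sketch evaluates f_r(\mathbf{z}, \eta) for a single hidden layer with sigmoidal activation. It is written in Python/NumPy purely for illustration (the paper's own computations were done in R), and all function and variable names are ours:

```python
import numpy as np

def sigmoid(u):
    """Sigmoidal activation function psi."""
    return 1.0 / (1.0 + np.exp(-u))

def network_output(z, A, b, beta):
    """One-hidden-layer network f_r(z, eta) = sum_k beta_k * psi(a_k' z + b_k).

    z    : input vector of length p + d
    A    : (r, p + d) matrix whose rows are the input-to-hidden weights a_k
    b    : length-r vector of hidden biases b_k
    beta : length-r vector of hidden-to-output weights beta_k
    """
    return float(beta @ sigmoid(A @ z + b))

# Tiny example: r = 3 hidden nodes, p + d = 2 inputs.
rng = np.random.default_rng(0)
A = rng.normal(size=(3, 2))
b = rng.normal(size=3)
beta = rng.normal(size=3)
y = network_output(np.array([0.5, -1.0]), A, b, beta)
```

Since \psi takes values in (0, 1), the output is always bounded by the \ell_1-norm of the output weights, which is the property exploited later when discussing generalization.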
Suppose that the processes Y_t and \mathbf{X}_t are observed for T consecutive time periods, generating the time series y_1, y_2, \ldots, y_T and \mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_T, along with appropriate initial conditions. Let \mathbf{u}_t = (y_t, \mathbf{x}_t) and \mathbf{z}_t = (y_{t-1}, y_{t-2}, \ldots, y_{t-p}, \mathbf{x}_t). A consistent estimate of the regression function m(\cdot) can be obtained as:
\hat m = \arg\min_{f \in \mathcal{F}} \left\| f(\mathbf{z}_t, \eta) - \mathbf{y} \right\| \qquad (2.5)
where \| \cdot \| denotes the L_2-norm. Alternatively, to improve the stability of the network solution, a regularized version of the optimization problem (2.5) can be used:
\hat m = \arg\min_{f \in \mathcal{F}} \left\| f(\mathbf{z}_t, \eta) - \mathbf{y} \right\| + \lambda \left\| \eta \right\| \qquad (2.6)
where the tuning parameter λ is usually fixed by cross-validation.
Neural networks provide an arbitrarily accurate approximation to unknown target functions satisfying certain smoothness conditions. In particular, under quite general conditions on the activation function \psi, there exists a sequence of network functions \{f_r\} approximating any given continuous target function m to within any expected learning error. A deterministic approximation rate (in L_2-norm) of r_T^{-1/2} for NNs with sigmoid activation functions has been obtained in [24]; for better rates see [25,26,27]. If the network model is fitted to the data in such a way that the complexity of the network is allowed to increase at a proper rate with the sample size, the resulting function estimator can be viewed as a nonparametric sieve estimator [28,29]. Moreover, estimation of the hidden layer size r_T seems to be less critical than estimation of the window size in local nonparametric approaches. Finally, extension of the sieve bootstrap procedure to high-dimensional models is (much) more straightforward than for other nonparametric approaches (absence of the 'curse of dimensionality').
However, despite their proven theoretical capabilities of non-parametric, data-driven universal approximation of a general class of nonlinear functions, NNs face other challenging issues. Firstly, the use of NNs requires the specification of the network topology in accordance with the underlying structure of the series: the size and structure of the input layer, the size of the hidden layer, and the signal processing within nodes (i.e., the choice of the activation function). Furthermore, in the context of time series analysis, the characteristics of the series and the presence of deterministic and/or stochastic components such as trend, seasonality, structural breaks and level shifts also require accurate feature selection. Many strategies have been proposed to solve these problems (for example, [30,31]), but the difficulty of finding a unique method able to automatically identify the optimal NN remains an open issue.
Moreover, once the neural network architecture is fixed, the parameters can be estimated by using the backpropagation algorithm, which is essentially a first-order gradient method for parameter optimization and suffers from slow convergence and local minima. In this context too, various ways to improve the efficiency or optimality of neural network training have been proposed, including second-order optimization methods [32], subset selection methods [33] and global optimization methods [34]. Although these methods lead to faster training and, in general, better generalization performance than the backpropagation algorithm, most of them still cannot guarantee a globally optimal solution.
Recently, the extreme learning machine (ELM) approach to training NNs has attracted attention in the literature [35,36,37] as a possible way to overcome some of the challenges faced by the other techniques.
The essence of ELMs is that, unlike traditional learning algorithms such as backpropagation-based training, the hidden node weights are randomly generated and need not be tuned, so that the output weights of the NN can be determined analytically.
Basically, ELM trains a NN in two main stages. In the first, ELM randomly initializes the hidden layer, which maps the input data into a feature space via some nonlinear functions. As in NN theory, these can be any nonlinear piecewise continuous functions, such as the sigmoid or hyperbolic functions. The hidden node parameters (\mathbf{a}, b) are randomly generated according to any continuous probability distribution, so that the matrix:
H = \begin{bmatrix} \psi(\mathbf{a}_1'\mathbf{z}_1 + b_1) & \cdots & \psi(\mathbf{a}_r'\mathbf{z}_1 + b_r) \\ \vdots & \ddots & \vdots \\ \psi(\mathbf{a}_1'\mathbf{z}_T + b_1) & \cdots & \psi(\mathbf{a}_r'\mathbf{z}_T + b_r) \end{bmatrix} \qquad (3.1)
is completely known. In the second stage, the output weights β are estimated by solving the following minimization problem:
\hat\beta = \arg\min_{\beta} \left\| H\beta - \mathbf{y} \right\| \qquad (3.2)
where \mathbf{y} is the training data target vector and \| \cdot \| denotes the L_2-norm.
If H^{\dagger} denotes the Moore-Penrose generalized inverse of the matrix H, the optimal solution to the previous optimization problem is:
\hat\beta = H^{\dagger} \mathbf{y} \qquad (3.3)
The matrix H^{\dagger} can be calculated by one of the numerous methods proposed in the literature, including orthogonal projection, orthogonalization methods, iterative methods and the singular value decomposition, the last being the most general.
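The two-stage ELM training just described can be sketched as follows. This is an illustrative Python/NumPy implementation under our own naming conventions (the paper's experiments used R), using the SVD-based pseudo-inverse for H^{\dagger}:

```python
import numpy as np

def elm_fit(Z, y, r, seed=None):
    """Two-stage ELM training (Eqs. 3.1-3.3).

    Z : (T, p + d) matrix whose rows are the input vectors z_t
    y : length-T target vector
    r : hidden layer size
    Returns (A, b, beta_hat); fitted values are sigmoid(Z @ A.T + b) @ beta_hat.
    """
    rng = np.random.default_rng(seed)
    # Stage 1: hidden node parameters (a_k, b_k) drawn at random, never tuned.
    A = rng.normal(size=(r, Z.shape[1]))
    b = rng.normal(size=r)
    H = 1.0 / (1.0 + np.exp(-(Z @ A.T + b)))   # the T x r matrix H of Eq. (3.1)
    # Stage 2: output weights beta_hat = H^dagger y (Eq. 3.3), via SVD-based pinv.
    beta_hat = np.linalg.pinv(H) @ y
    return A, b, beta_hat

# Example: fit a noisy quadratic with r = 25 hidden nodes.
rng = np.random.default_rng(0)
Z = rng.uniform(-1, 1, size=(200, 1))
y = Z[:, 0] ** 2 + 0.05 * rng.normal(size=200)
A, b, beta_hat = elm_fit(Z, y, r=25, seed=0)
fitted = (1.0 / (1.0 + np.exp(-(Z @ A.T + b)))) @ beta_hat
```

Note that the only estimation step is a single linear least-squares solve; this is the source of the speed-ups reported in the simulation section.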
The parameter vector \beta can also be estimated via regularized ELM [38]. If H has more rows than columns (T > r), which is usually the case when the number of training data points is larger than the number of hidden neurons, the following closed-form solution can be obtained:
\hat\beta = \left( H'H + \frac{I}{C} \right)^{-1} H'\mathbf{y} \qquad (3.4)
where I is the identity matrix of dimension r and C is a properly chosen constant.
If the number of training data points is less than the number of hidden neurons (T < r), an estimate for \beta can be obtained as:
\hat\beta = H'\left( HH' + \frac{I}{C} \right)^{-1} \mathbf{y}
where I is now the identity matrix of dimension T.
It can be shown that Eq. (3.4) actually minimizes:
\hat\beta = \arg\min_{\beta} \left\| H\beta - \mathbf{y} \right\| + \frac{1}{C}\left\| \beta \right\|
Compared to standard ELM, in which the target is to minimize \| H\beta - \mathbf{y} \|, an extra penalty term \frac{1}{C}\| \beta \| is added to the objective. This is consistent with the theory that smaller output weights \beta play an important role in achieving better generalization ability.
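The two closed forms above (for T > r and T < r) are algebraically equivalent by the push-through identity (H'H + I/C)^{-1}H' = H'(HH' + I/C)^{-1}; the practical point is to invert only the smaller of the two Gram matrices. A minimal Python/NumPy sketch, with names of our own choosing, checks this numerically:

```python
import numpy as np

def elm_ridge_weights(H, y, C):
    """Regularized ELM output weights.

    For T >= r uses (H'H + I/C)^{-1} H' y (Eq. 3.4); for T < r uses the
    equivalent form H'(HH' + I/C)^{-1} y, which only inverts a T x T matrix.
    """
    T, r = H.shape
    if T >= r:
        return np.linalg.solve(H.T @ H + np.eye(r) / C, H.T @ y)
    return H.T @ np.linalg.solve(H @ H.T + np.eye(T) / C, y)

# The two closed forms agree for any shape of H (push-through identity).
rng = np.random.default_rng(0)
H = rng.normal(size=(12, 5))
y = rng.normal(size=12)
beta_tall = np.linalg.solve(H.T @ H + np.eye(5) / 10.0, H.T @ y)
beta_wide = H.T @ np.linalg.solve(H @ H.T + np.eye(12) / 10.0, y)
```

Unlike the pseudo-inverse solution, the regularized solve is always well conditioned for C < \infty, which is why the ridge variant is often preferred in practice.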
The ELM approach has several advantages. Firstly, it has good generalization performance, in the sense that it reaches a small training error and, at the same time, the smallest norm of the output weights. Secondly, learning can be done without iteratively tuning the hidden nodes, which can be generated independently of the training data. Moreover, ELMs, like NNs, enjoy the property of being universal approximators. Given any nonconstant piecewise continuous function \psi, if
\mathrm{span}\left\{ \psi(\mathbf{a}, b, \mathbf{z}): (\mathbf{a}, b) \in \mathbb{R}^{p+d} \times \mathbb{R} \right\}
is dense in L_2, then for any continuous target function m and any sequence \psi(\mathbf{a}_k'\mathbf{z} + b_k), k = 1, \ldots, r, randomly generated according to any continuous sampling distribution, with the output weights \hat\beta determined by ordinary least squares to minimize:
\left\| m(\mathbf{z}) - \sum_{k=1}^{r} \hat\beta_k \, \psi(\mathbf{a}_k'\mathbf{z} + b_k) \right\|
it can be shown [39,40,41] that, with probability one:
\lim_{r \to \infty} \left\| m(\mathbf{z}) - \sum_{k=1}^{r} \hat\beta_k \, \psi(\mathbf{a}_k'\mathbf{z} + b_k) \right\| = 0 \qquad (3.5)
This result states the universal approximation capability of ELMs without imposing the restrictive assumptions required in the NN paradigm, where, on the contrary, a continuous and differentiable activation function is needed. In practice, since the hidden layer is randomly generated, ELMs usually require more hidden neurons than NNs to obtain a given performance. However, this does not seem to be a serious problem, due to the computational efficiency of ELMs. Moreover, ELMs are well suited for large-scale data processing and, even when a model selection process is implemented to search for an optimal structure, the running time of ELMs remains lower than that of competing strategies. In any case, parallel and cloud computing techniques [42] can also be used for even faster implementation of ELMs.
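The approximation property in Eq. (3.5) can be observed empirically: fitting random sigmoid expansions of increasing size r to a fixed continuous target by ordinary least squares, the L_2 training error shrinks as r grows. The following Python/NumPy sketch (our own illustrative setup, with a sinusoidal target of our choosing) demonstrates this:

```python
import numpy as np

rng = np.random.default_rng(42)
z = np.linspace(-2.0, 2.0, 400)
m = np.sin(np.pi * z)                      # continuous target function m

def elm_approx_error(r):
    """L2 training error of an r-term random sigmoid expansion fitted by OLS."""
    a = rng.normal(size=r)                 # random hidden weights (scalar input)
    b = rng.normal(size=r)
    H = 1.0 / (1.0 + np.exp(-(np.outer(z, a) + b)))
    beta, *_ = np.linalg.lstsq(H, m, rcond=None)
    return float(np.linalg.norm(H @ beta - m))

errors = {r: elm_approx_error(r) for r in (2, 10, 50)}
```

With a handful of neurons the error stays large, while for r = 50 the random expansion already tracks the target closely, in line with the probability-one convergence stated above.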
Given a general real-valued stochastic process \{Y_t, t \in \mathbb{Z}\} \sim M_0 modeled as in Eq. (2.1), the basic idea of a sieve bootstrap scheme is to approximate the process by a family of (semi-)parametric models \mathcal{M} = \{M_j, j \in \mathbb{N}\} such that \cup_{j = 1}^\infty M_j contains (in some sense) the original process M_0. Once a proper model selection rule is fixed, a model is picked from the set \mathcal{M} and estimated, and residual bootstrap is applied to the previously estimated model. Hence, a key issue is the selection of a proper model family.
Algorithm 1 The NAR-Sieve bootstrap scheme.
1: Fix B, the number of bootstrap runs.
2: Consider the time series \left\{ \mathbf{u}_{t}, t = 1, 2, \ldots, T \right\} with \mathbf{u}_{t} = (y_{t}, \mathbf{z}_{t}).
3: Let \mathbf{z}_{t} = \left\{ (y_{t-1}, y_{t-2}, \ldots, y_{t-p}, \mathbf{x}_t), t = p+1, \ldots, T \right\}.
4: Fix n_{1}, the number of observations to be discarded in order to make the effect of the starting values negligible.
5: Let N = n_{1} + T.
6: Estimate m(\cdot) by using an ELM, obtaining \hat m(\cdot).
7: Compute the residuals \hat \varepsilon_t = y_t - \hat m(\mathbf{z}_{t}).
8: Compute the centered residuals \tilde \varepsilon_t = \hat \varepsilon_t - (T - p)^{-1} \sum_{t = p+1}^{T} \hat \varepsilon_t.
9: Denote the empirical distribution function of the centered residuals \tilde \varepsilon_t by F_{\tilde \varepsilon}(x) = (T - p)^{-1} \sum_{t = p+1}^{T} \mathbb{I}(\tilde \varepsilon_t \leq x), where \mathbb{I}(\cdot) is the indicator function.
10: for b = 1, 2, \ldots, B do
11: Resample, for t = p+1, p+2, \ldots, N, \varepsilon_{(b, t)}^* \stackrel{iid}{\sim} F_{\tilde \varepsilon}.
12: Fix y_{(b, t)}^* = \bar y for t = 1, \ldots, p. Define \mathbf{z}^{*}_{(b, t)} = (y^{*}_{(b, t-1)}, \ldots, y^{*}_{(b, t-p)}, \mathbf{x}_{t}).
13: Define y_{(b, t)}^* by the recursion y_{(b, t)}^* = \hat m(\mathbf{z}_{(b, t)}^*) + \varepsilon_{(b, t)}^*, \; t = p+1, \ldots, N.
14: Compute \hat \theta^{*}_{(b, T)} = q(\mathbf{u}^{*}_{(b, 1)}, \mathbf{u}^{*}_{(b, 2)}, \ldots, \mathbf{u}^{*}_{(b, T)}).
15: end for
16: Use the empirical distribution function \hat F^*(x) = B^{-1} \sum_{b = 1}^{B} \mathbb{I}(\hat \theta^*_{(b, T)} \leq x) to approximate the unknown sampling distribution of the estimator \hat \theta_T.
To elaborate, let \left\{ \mathbf{u}_{t} = (y_{t}, \mathbf{x}_t), t = 1, 2, \ldots, T \right\} be the observed time series, let \theta be a finite-dimensional parameter of interest, and let \hat \theta_T = q(\mathbf{u}_1, \mathbf{u}_{2}, \ldots, \mathbf{u}_T) be a scalar-, vector- or curve-valued estimator, which is a measurable function of the data. Inference on \theta can be obtained by using the NN-Sieve bootstrap approach. The procedure, proposed in [18], exploits the good properties of neural network modelling; it has been shown to be asymptotically justified, delivering consistent results for quite general nonlinear models, and it yields satisfactory results for finite sample sizes [19].
Here, we propose to approximate the unknown function m(\cdot) with the class of ELMs with a fixed number of input neurons and a hidden layer size growing with the time series length at a proper rate. The resampling procedure is detailed in Algorithm 1.
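The steps of Algorithm 1 can be sketched compactly as follows. This is a simplified Python/NumPy illustration of our own (the paper's code was in R): no exogenous variables, a sigmoid ELM fitted by pseudo-inverse, and the statistic of interest q(\cdot) passed in as a function:

```python
import numpy as np

def nar_sieve_bootstrap_elm(y, p=1, r=10, B=200, n1=100, stat=np.mean, seed=0):
    """Sketch of Algorithm 1 (no exogenous variables): ELM fit of m(.),
    centred-residual resampling, recursive regeneration, statistic per run."""
    rng = np.random.default_rng(seed)
    T = len(y)
    act = lambda M: 1.0 / (1.0 + np.exp(-M))
    # Lag matrix: row for t = p+1, ..., T holds (y_{t-1}, ..., y_{t-p}).
    Z = np.column_stack([y[p - 1 - j:T - 1 - j] for j in range(p)])
    target = y[p:]
    # Steps 6-7: ELM estimate of m and residuals.
    A = rng.normal(size=(r, p))
    b = rng.normal(size=r)
    H = act(Z @ A.T + b)
    beta = np.linalg.pinv(H) @ target
    eps = target - H @ beta
    eps -= eps.mean()                      # step 8: centred residuals
    reps = np.empty(B)
    for bb in range(B):                    # steps 10-15
        ys = np.full(n1 + T, y.mean())     # start at ybar, burn in n1 values
        for t in range(p, n1 + T):
            zt = ys[t - p:t][::-1]         # (y*_{t-1}, ..., y*_{t-p})
            ys[t] = act(A @ zt + b) @ beta + rng.choice(eps)
        reps[bb] = stat(ys[n1:])
    return reps

# Example: bootstrap the mean of an AR(1) series.
rng = np.random.default_rng(3)
y = np.zeros(200)
for t in range(1, 200):
    y[t] = 0.5 * y[t - 1] + rng.normal()
reps = nar_sieve_bootstrap_elm(y, p=1, r=10, B=50, n1=50, seed=3)
```

The empirical distribution of `reps` then approximates the sampling distribution of the estimator, as in step 16 of Algorithm 1.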
The proposed resampling scheme has some advantages which make it effective in many application fields. Like the NN-Sieve bootstrap, this approach is asymptotically justified and delivers consistent results for quite general nonlinear processes [19]. Moreover, it does not suffer from the so-called 'curse of dimensionality': theoretically, ELMs are expected to perform better than other approximation methods, since the approximation is not so sensitive to increasing dimension, at least within the confines of particular classes of functions.
Like neural networks, ELMs are global nonparametric methods, and their use could stress different features and data structures when compared to local nonparametric methods. With respect to the NN-Sieve bootstrap, this scheme dramatically reduces the computational burden of the overall procedure, with a computing time comparable to the AR-Sieve bootstrap.
In this section, we discuss the results of a simulation experiment performed in order to evaluate the performance of the proposed procedure and to compare it with the NN approach. All computations were implemented in the language R (version 3.6.0), using the package nnet (for feedforward neural networks with regularization) and ad hoc implementations by the authors for ELMs and ELMs with regularization. The latter implementation is based on the package ridge [43], which also includes the selection of the regularization parameter. All computations were run on two different workstations: the first running MacOS (version 10.14.4) with a 4 GHz Intel Core i7 and 16 GB of 1600 MHz DDR3 memory, using the standard R math library; the second running Ubuntu Linux (version 18.04) with a 3.50 GHz Intel Xeon E5-1650 and 32 GB of ECC DDR4 memory, using OpenBLAS.
First, the computational advantage of ELMs is evaluated in terms of the computing time of the learning process. Tables 1 and 2 report some descriptive statistics of the computing times of NNs, ELMs and ELMs with regularization for a regression problem at two levels of complexity. The first model has 2 predictors and 2 neurons in the hidden layer, and was estimated on a sample of 300 observations. In this case, the median execution time of the learning process of a full neural network (with ten random restarts) is 54 times that of ELM with regularization and about 300 times that of a learning process based on the Moore-Penrose generalized inverse. The second model is much more complex, with 16 predictors and 40 neurons in the hidden layer, and was estimated on a sample of 2000 observations. In this case, the median execution time of a full neural network (with ten random restarts) is about 478 times that of ELM with regularization and about 1575 times that of a learning process based on the Moore-Penrose generalized inverse. The more complex the model, the greater the advantage of using ELMs. As a remark, note that it is well known that the standard R math library does not deliver the best performance for linear algebra operations, so the figures reported in Tables 1 and 2 could be significantly improved by using better Basic Linear Algebra Subprograms (BLAS) implementations, such as OpenBLAS or Intel's Math Kernel Library (MKL), as is evident from the results reported in the following.
Table 1.
Method | min | 1^{st} quart. | mean | median | 3^{rd} quart. | max | runs |
NN | 37.54 | 48.22 | 54.14 | 54.75 | 59.02 | 73.29 | 100 |
Reg | 0.75 | 0.95 | 1.40 | 1.02 | 1.13 | 24.56 | 100 |
ELM | 0.11 | 0.16 | 0.40 | 0.18 | 0.20 | 21.99 | 100 |
Table 2.
Method | min | 1^{st} quart. | mean | median | 3^{rd} quart. | max | runs |
NN | 15886.80 | 17575.38 | 17906.21 | 17791.85 | 18252.19 | 20371.12 | 100 |
Reg | 32.86 | 35.11 | 41.46 | 37.18 | 43.88 | 256.92 | 100 |
ELM | 8.62 | 10.85 | 11.97 | 11.29 | 11.77 | 22.29 | 100 |
Figures 1 and 2 report the relative execution times of NN, ELM and ELM with regularization, for different numbers of input neurons d \in \{4, 8, 12, 16\}, different hidden layer sizes r = 2, 3, \ldots, 40 and samples of different sizes \{300, 500, 1000, 2000\}. In all cases the computational gain from using ELM is very significant, ranging from about 300 times (for the simplest model, with two input neurons and two hidden neurons) to 6,000 times for the most complex model considered (16 input neurons and 40 hidden neurons) when using OpenBLAS. Note that, to make the comparison between the OpenBLAS implementation and the standard R math library (which is single-threaded) fair, we forced OpenBLAS to use a single thread; even better performance might therefore be expected from a multi-threaded OpenBLAS implementation. All neural networks were estimated by restarting the learning process ten times, to avoid being trapped in local minima (an operation not necessary when using ELMs). In many applications, it is standard to use 50 random restarts, making the advantage of ELM even greater. As a further remark, note that when using ELM with the regularized learning process, the computational advantage is reduced by a factor of 4 on average; however, this is still an important figure. In the NN learning process, the regularization parameter was not estimated on each dataset but fixed on the basis of ad hoc choices; the selection of this tuning parameter for NNs is usually based on cross-validation, making the computational burden heavy and in many applications unfeasible.
The computational advantage of using ELMs (with or without regularization) in bootstrap schemes is so large that the whole bootstrap distribution of a given statistic of interest can be estimated in essentially the same computational time needed for a single neural network estimation (considering 1,000–2,000 bootstrap runs to approximate the bootstrap distribution). Moreover, in several applications the neural network model needs to be re-estimated on each bootstrap run. This is the case, for example, when computing bootstrap prediction intervals, where model re-estimation is needed to incorporate the uncertainty due to model estimation. In this latter case using NNs is almost unfeasible, while using ELMs makes the overall bootstrap procedure possible within very reasonable computing time.
As a further step in the simulation design, we evaluate and compare the performance of the bootstrap procedure based on NNs and on ELMs, in order to check whether there is any loss of accuracy in the bootstrap inference process. The experimental setup is based on datasets generated by different nonlinear models with different degrees of nonlinearity.
We consider the class of STAR models as specified in [45]:
Y_{t} = \phi_{1}Y_{t-1} - (\phi_{1} - \phi_{2})F\left(Y_{t-1}, \gamma\right)Y_{t-1} + \varepsilon_{t}
with \varepsilon_{t}\sim N(0, 1) . The function F determines different transition processes. If F\left(u, \gamma\right) = 1-\exp(-\gamma u^{2}) we get the exponential STAR (ESTAR) model, while using F\left(u, \gamma\right) = \frac{1}{1+\exp(-\gamma u)} we get the logistic smooth transition (LSTAR) model. These models, very popular in applications across different fields, have been chosen as representative of different kinds of dynamic behaviour, since their flexibility allows the generation of quite different time series structures.
The degree of nonlinearity in the LSTAR/ESTAR models is controlled by the parameter \gamma in the transition function. When \gamma \to 0 , the ESTAR transition function tends towards 0 and the model reduces to a simple autoregressive process, while the LSTAR transition tends towards 1/2, yielding a linear autoregression whose coefficient is the mean of the autoregressive parameters of the two regimes. When \gamma \to \infty , the ESTAR transition converges towards unity, which again implies a (different) linear autoregressive model, while the LSTAR transition converges to a step function and the model switches abruptly between the two regimes.
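Series from this data-generating process can be simulated directly from the recursion above. A minimal generator covering both transitions (our sketch, with a burn-in period to remove dependence on the starting value; names and defaults are ours):

```python
import numpy as np

def simulate_star(T, phi1, phi2, gamma, kind="lstar", burn=100, rng=None):
    """Simulate the STAR(1) model
    Y_t = phi1*Y_{t-1} - (phi1 - phi2)*F(Y_{t-1}, gamma)*Y_{t-1} + eps_t."""
    rng = np.random.default_rng(rng)
    if kind == "lstar":
        F = lambda u: 1.0 / (1.0 + np.exp(-gamma * u))  # logistic transition
    else:
        F = lambda u: 1.0 - np.exp(-gamma * u ** 2)     # exponential transition
    y = np.zeros(T + burn)
    eps = rng.standard_normal(T + burn)
    for t in range(1, T + burn):
        y[t] = phi1 * y[t - 1] - (phi1 - phi2) * F(y[t - 1]) * y[t - 1] + eps[t]
    return y[burn:]  # discard the burn-in observations
```

For instance, `simulate_star(500, 0.6, -0.6, 5.0)` corresponds to specification v2 of the LSTAR class in Table 3.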
The parameters \phi_{1} , \phi_{2} and \gamma have been fixed according to the values in Table 3, leading to four different specifications for each class of models (denoted as v1, v2, v3 and v4).
Parameter | v1 | v2 | v3 | v4 |
\phi_{1} | 0.1 | 0.6 | 0.1 | 0.6 |
\phi_{2} | -0.1 | -0.6 | -0.1 | -0.6 |
\gamma | 5 | 5 | 25 | 25 |
In order to evaluate the distribution of either linear estimators or simple nonlinear estimators, we consider, for all the models, four statistics: the sample mean, the sample median, the sample variance and the sample autocovariance (at lag 1). Let \hat \theta be the statistic of interest and let \mathcal{R}^{*} = (\hat \theta ^{*} - \mathbb{E}_{*}[\hat \theta ^{*}])/\sigma_{T} denote the bootstrap counterpart of the root \mathcal{R} = (\hat \theta - \mathbb{E}[\hat \theta])/\sigma_{T} , where \sigma_{T} = \sqrt{{\rm Var}(\hat \theta)} denotes the true standard deviation. The true standard errors \sigma_{T} have been estimated by a Monte Carlo simulation with 100,000 runs. The distribution of \mathcal{R}^{*} has been generated by using the NAR-Sieve bootstrap based on NNs, ELMs and regularized ELMs.
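In schematic form, the resampling step of a nonlinear autoregressive sieve bootstrap can be sketched as follows (our minimal Python sketch for a lag-1 autoregression, not the paper's implementation; `fit` and `predict` stand for any autoregression estimator, e.g. the ELM above):

```python
import numpy as np

def nar_sieve_bootstrap(y, fit, predict, stat, B=1000, rng=None):
    """Minimal NAR-sieve bootstrap sketch: fit an autoregression, resample
    the centred residuals i.i.d., regenerate the series recursively, and
    recompute the statistic of interest on each pseudo-series."""
    rng = np.random.default_rng(rng)
    X, z = y[:-1, None], y[1:]
    model = fit(X, z)
    resid = z - predict(model, X)
    resid = resid - resid.mean()              # centre the residuals
    stats = np.empty(B)
    for b in range(B):
        e = rng.choice(resid, size=len(y))    # i.i.d. resampling of residuals
        ystar = np.empty(len(y))
        ystar[0] = y[0]
        for t in range(1, len(y)):
            ystar[t] = predict(model, ystar[t - 1:t, None])[0] + e[t]
        stats[b] = stat(ystar)
    return stats
```

The bootstrap distribution of the root is then obtained by centring and scaling the returned statistics; swapping the NN for an ELM changes only the `fit`/`predict` pair.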
All simulations are based on N = 500 Monte Carlo runs with time series of length T \in \{300, 500, 1000, 2000\} . The bootstrap distributions have been estimated using B = 1,000 bootstrap replicates. The final experiment is based on 384 design points (2 classes of models \times 4 model specifications \times 4 statistics \times 4 time series lengths \times 3 bootstrap implementations).
The results are reported in Figures 3 and 4. For all the statistics considered, the performance of the bootstrap based on ELMs is comparable with that of the bootstrap based on NNs, or even slightly better in some cases. For linear functionals, such as the mean, the performance of the three methods is very close, with low variability, showing excellent accuracy when using the bootstrap to estimate the true unknown standard error. For nonlinear functionals, such as the median, the results are very similar and remain stable across the two classes of models and the different degrees of nonlinearity. For complex functionals that can be expressed as functions of means, such as the variance and the autocovariance, the results remain consistent but show lower accuracy for shorter time series.
Moreover, the degree of nonlinearity of the models might reduce the accuracy of the bootstrap standard error estimation. However, ELMs appear to deliver better results in these latter cases, both in terms of accuracy and bias, while requiring only a fraction of the computational time needed for NNs. Finally, all estimators appear to be consistent, with an apparent convergence to the reference value and decreasing variability as the time series length increases.
As an application to real data, we consider normalized tree-ring widths in dimensionless units. The data were recorded by Donald A. Graybill, 1980, from Gt Basin Bristlecone Pine 2805M, 3726–11810 in Methuselah Walk, California. It is a univariate time series with 7981 yearly observations from 6000 BC to 1979. Tree-ring data are of great importance in ecology in general and in climate change studies in particular. Other fields of interest include archaeology (for dating materials and artefacts made from wood), chemistry (where tree-ring-based methods are used to calibrate radiocarbon dates) and dendrology (which also includes forestry management and conservation).
In this application, we limit the analysis to the period from 1001 to 1979 (roughly the last 1000 years). The time plot of the data and its autocorrelation function (for the first 20 lags) are reported in Figure 5. The data generating process appears to be stationary, with a decreasing autocorrelation function whose first five lags are statistically significant at the 5% level. To draw inference on the true autocorrelations, given the observed time series, the sampling distribution of the estimates is derived by using the sieve bootstrap based both on NNs and on ELMs. The kernel density estimates of the bootstrap distributions for the first six lags are reported in Figure 6. Clearly, the bootstrap estimates appear able to capture the asymmetry of the true sampling distribution of the autocorrelations. There is a slight difference between the NN and ELM sieve bootstrap at the first lag, but in all other cases the two approaches considered in the paper deliver similar results.
Given the bootstrap sampling distribution, confidence intervals with nominal level 95% are derived and plotted in Figure 7 for the first 20 lags. Both procedures identify the first four lags as significant, while all other lags cannot be considered different from zero at the given confidence level. The different conclusion at lag five can be explained by the higher accuracy of the bootstrap-based inference with respect to the normal approximation used to construct the confidence intervals reported in Figure 5.
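Percentile intervals of this kind are obtained directly from the bootstrap distributions, one pair of quantiles per lag (a generic sketch, not the paper's code; names are ours):

```python
import numpy as np

def acf_percentile_ci(boot_acfs, level=0.95):
    """Percentile confidence intervals from a (B x nlags) matrix of
    bootstrap autocorrelation estimates: one (lower, upper) pair per lag."""
    alpha = 1.0 - level
    lower = np.quantile(boot_acfs, alpha / 2, axis=0)
    upper = np.quantile(boot_acfs, 1 - alpha / 2, axis=0)
    return lower, upper
```

A lag is then flagged as significant when zero falls outside its interval, which is the criterion applied to the 20 lags in Figure 7.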
In this paper, a novel nonlinear autoregressive sieve bootstrap scheme based on ELMs has been proposed and discussed. To evaluate the performance of the proposed approach, a Monte Carlo simulation experiment has been implemented. Bootstrap schemes based on neural networks appear to be an encouraging solution for extending sieve bootstrap techniques to nonlinear time series. In this framework, ELMs can dramatically reduce the computational burden of the overall procedure, with performance comparable to the NN-Sieve bootstrap and computing time comparable to the AR-Sieve bootstrap.
Moreover, alternative algorithms for ELM estimation can be considered. Although the orthogonal projection method can be used to calculate the Moore-Penrose inverse efficiently, and the solution can be obtained easily and quickly, regularized versions of least squares have proved to be more stable. These techniques regularize the coefficients (controlling how large they grow) by penalizing their magnitude while minimizing the error between predicted and actual observations. Adding a penalty term can improve the stability of ELMs by reducing the variance of the estimates. Even when using the regularized versions based on Tikhonov ( L_{2} ) regularization, the computational advantage of ELMs over classical NNs is maintained. Extensions of the bootstrap resampling scheme using L_{1} regularization (such as the LASSO) or a combination of L_{1} and L_{2} regularization (such as the elastic net) are still under study.
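Under the L_{2} penalty, the ELM output weights have the closed form \beta = (H^{\top}H + \lambda I)^{-1}H^{\top}y, so the regularized fit is still a single linear solve. A minimal sketch (the function name and default \lambda are ours):

```python
import numpy as np

def elm_ridge_weights(H, y, lam=1e-2):
    """Ridge (Tikhonov) solution for the ELM output layer.

    H is the hidden-layer design matrix; lam > 0 shrinks the weights,
    stabilising the plain Moore-Penrose least-squares solution."""
    r = H.shape[1]
    return np.linalg.solve(H.T @ H + lam * np.eye(r), H.T @ y)
```

As lam tends to zero this recovers the unregularized least-squares weights, which is consistent with the roughly four-fold (rather than orders-of-magnitude) overhead of the regularized ELM reported above.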
The performance of the sieve bootstrap based on ELMs depends on the Basic Linear Algebra Subprograms (BLAS) implementation. Using OpenBLAS in place of the standard R library (even in single-thread mode) can significantly improve performance, with better scaling properties for larger problems. Other BLAS implementations that could be considered are Intel's Math Kernel Library (MKL) and the ATLAS library, a general-purpose auto-tuned library.
However, several aspects should be further explored through a more extensive simulation study. These include the sensitivity and stability of the NAR-Sieve bootstrap to lag structure misspecification and to the choice of the hidden layer size. These topics are still under investigation and outside the scope of this paper. As a final remark, note that ELMs have been extended to deal effectively with large-scale data problems. The feasibility of bootstrap resampling schemes in this framework is still under investigation.
The authors wish to thank the Associate Editor and the anonymous referees for their helpful comments and suggestions.
The authors declare no conflict of interest.