1.
Introduction
In December 2019, a novel coronavirus (COVID-19) was discovered [1]. COVID-19 is a respiratory infectious disease caused by the severe acute respiratory syndrome coronavirus, transmitted by contact, aerosols and inhalation of virus-infected droplets [2,3]. Unfortunately, there were no effective drugs to treat the disease in 2021 [1]. Therefore, the control protocols were mainly physical isolation, such as quarantine, contact tracing and social lockdown [4,5,6,7,8]. Therefore, it is crucial to study the effectiveness of control measures for this disease.
The basic reproduction number R0 is a critical parameter in the analysis of infectious diseases. It measures the average number of secondary infections caused by a typical infectious individual in a fully susceptible population. The disease will not cause an epidemic if R0<1 [11,12,13]. As the epidemic progresses, or when control measures such as social distancing and vaccination are implemented, susceptible individuals are depleted and some of the contacts of the infectious individuals are made to already infected individuals, which do not cause transmission. In this case, the threshold for disease spread is measured by the effective reproduction number Rt, which measures the number of secondary infections caused by a typical infectious individual in the current population (with some already infected individuals). Its value is usually calculated as the product of R0 and the average population susceptibility [2,4,11]. In the initial stage of disease when the number of infected individuals is only a tiny fraction of the total population, Rt≈R0. Wallinga and Teunis [9] proposed a method to approximate the effective reproduction number, which studied the average number of reproduction for patients who are infected on a given day. Note that this is not the same as the number of infections caused by a patient on a given day. In addition, this method requires the time of infection of patients, which is difficult to trace. Cori et al. [14] presented a simple method for estimating the effective reproduction number, which is based on the time series of disease occurrence. However, this method's effective reproduction number Rt is delayed significantly. Das [4] presented a method for estimating an approximate Rt by taking into account both the mean generation interval and the instantaneous exponential growth rate. However, the instantaneous exponential growth rate can only be estimated from a long enough time period, preventing this method from detecting sudden changes in Rt. In this paper, we propose a new method to directly calculate the real-time reproduction number through confirmed cases and evaluate the effectiveness of control measures.
Based on reported cases of COVID-19, we use the back-calculating method [15] to obtain the number of incidences and the number of infectious patients on each day and then we use them to estimate the change in reproduction number. Moreover, we derive the impact of control measures from the change in the reproduction numbers.
We establish the model in Section 2 and verify it using simulations in Section 3. In Section 4, we show how we applied the model to British Columbia (BC), Canada to obtain their reproduction number and the impact of control measures on the reproduction number; the results are summarized and future work is discussed in Section 5.
2.
Methods
2.1. Model
In this section, we consider a discrete-time stochastic seir model in a randomly mixed population. Let St, Et and It be the number of susceptible, latent and infectious individuals on day t. Because of the random-mixing assumption, the expected number of new infections on day t is
A newly infected patient goes through a latent period L and becomes infectious. Here L is a discrete random variable with a probability mass function {pi}∞i=0, i.e., the probability that the latent period has a length of i days is pi. This patient then goes through an infectious period X and is diagnosed. Let the probability mass function of X be {qi}∞i=0. Let Qi be the cumulative probability function of X, that is Qi=i∑j=0qj, 0≤i≤∞. We assume that, once diagnosed, the patient is fully isolated and stops being infectious. The course of disease Y is the sum of the latent and infectious periods, i.e., Y=L+X. Let di be the probability mass function of Y. Then
By definition, the basic reproduction number, i.e., the average number of secondary infections caused by a typical infectious individual during the infectious period in a fully susceptible population, is
where E[X]=∞∑i=0iqi is the mean infectious period.
Thus, the effective reproduction number on day t is
Note that the last step is from (1).
Given the mean infectious period E[X], we need to estimate the number of new infections Zt and the number of infectious individuals It so that we can estimate the effective reproduction number Rt.
Note that Zt and mt have the following relationship
With the mt given, we need to solve Zt. Unfortunately, this is a deconvolution problem, and it is difficult to solve (see, e.g., [16]). Instead of solving it, we use the following method to approximate Zt. Suppose that the number of diagnosed cases on day t, namely mt, is observed for days t=0,1,…,T. A patient who is diagnosed on day t+i, 0≤i≤T−t was infected on day t if and only if the serial interval Y=i, i.e.,
Note that (5) does not solve the deconvolution problem given by (4). However, we will show that this can give a good approximation for Zt, especially if mt is approximately exponentially growing or decaying and the change in the exponential growth rate is slow (measured on the time-scale of the mean serial interval). To see this, assume that Zt=Z0μt for a constant μ>0 (the exponential growth rate is thus logμ); then, (4) becomes
where g(x)=∑∞i=0xidi is the probability generating function of the serial interval distribution di. Substitute this into the right hand side of (5) and assume T≫1; then,
Thus, if the change in Zt is slow, i.e., μ≈1, then g(μ)g(1/μ)≈1.
Using a similar approximation, a patient who is diagnosed on day t+i was infectious on day t because the infectious period X≥i. That is,
Thus, from (3),
The symbols used in this article are described in detail in Table 1.
2.2. Confidence interval
Given the mean infectious period E[X], in order to find the 95 confidence interval of Rt, we will get a random sample of Zt and It by using the Monte Carlo method. For the patients who are diagnosed on day t, let ˜Zt,t−i be the number of those who are infected on day t−i for i=0,1,…. Then, the approximation given by (5) is equivalent to the following two steps:
a) Assume that ˜Zt,t−i is multinomially distributed according to
b) The number of individuals who are infected on day t is
Note that the mean of (11) is given by (5).
Similarly, for patients who show symptoms on day t, the number of those who were infected on day t−i is
Thus, the number of individuals who are infected on day t is
Note that, here, Ct is a random variable, the mean of which is given by
which uses a similar approximation as (5).
The expected number of individuals It who are infectious on day t is the total number of people who have become infectious but have not been removed from transmission (via recovery or isolation). Thus, It can be calculated as
where the first term on the right-hand side is the number of patients who have become infectious before (or on) day t, while the second term is the number of patients who have been diagnosed and isolated before (or on) day t.
To generate one sample of Zt and It, for each mt, t=0,1,…,T, we use (12) to generate a sample for ˜Ct,t−i, and then use (13) to calculate Ct. We then use the calculated Ct to generate a sample of Zt using (10) and (11), and we use the calculated Ct to generate a sample of It using (10), (11) and (15). We can then use (3) to calculate a sample of the curve Rt.
We generate 105 samples for Rt. For each t=0,1,…,T, we use these samples to estimate the 95 confidence interval.
3.
Model validation
To verify that our model can correctly estimate the reproduction number, we apply (9) to a dataset generated from stochastic simulations by using the method in Section 2.2 to estimate the 95 confidence interval of the reproduction number.
Non-pharmacological intervention (NPI) measures reduce the transmission rate [17]. The latent period and infectious period are specific to the disease, and are not affected by NPI measures. Thus, we consider the following two cases.
Case 1 Seasonal variation in the transmission rate is sometimes approximated by a sinusoidal function [10,18,19]. Here we assume β to be sinusoidal to verify that our method can detect continuous change in β(t). Specifically,
In this case, the infectious period is assumed to be gamma-distributed with a shape parameter of 3 and a rate parameter of 0.2; the latent period is assumed to be gamma-distributed with a shape parameter of 3 and a rate parameter of 0.3. Note that these choices only serve as a numerical example and are (no comma) not tied to a specific disease.
Case 2 The transmission rate is assumed to be a step function to simulate a sequence of control measures that cause sudden change in β, that is,
In this case, the infectious period is assumed to be gamma-distributed with a shape parameter of 3 and a rate parameter of 0.25; the latent period is assumed to be gamma-distributed with a shape parameter of 3 and a rate parameter of 0.2.
In both cases, the simulated population size is 106. During the simulated time periods, the number of infected individuals is only a small fraction of the population size; thus, S(t)≈1 and the effective reproduction number is approximately the basic reproduction number.
Figure 1 shows the comparison of the estimated reproduction number Rt as a function of time with the true value for Case 1. Figure 2 shows the comparison for Case 2. In both cases, our method can correctly estimate the reproduction number. In addition, these figures also show that the confidence interval narrows with a larger case count.
4.
Application to the COVID-19 outbreak in BC, Canada in 2020
Now that we have validated our method, in this section, we apply the method to study the change of the reproduction number as a function of time for the COVID-19 outbreak in BC, Canada in 2020.
In BC, we consider the following policy changes:
● Provincial state of emergency was declared on March 17;
● Businesses reopened on May 19;
● Provincial state of emergency was declared on July 7;
● Provincial state of emergency was declared on August 5;
● Public K-12 schools reopened on September 10;
● Provincial state of emergency was declared on October 28;
● Provincial state of emergency was declared on December 23.
We used the daily reported case count data for the period of March 1 to December 31, 2020 that were released from the BC Centre for Disease Control (BCCDC) as a spreadsheet. This spreadsheet has been taken offline. However, the data can still be accessed via the COVID-19 dashboard on the BCCDC website [20].
To apply our method, we need to know the latent period distribution and the infectious period distribution. We use the latent period distribution estimated by [21], which is gamma-distributed with a mean of 5.48 days and a standard deviation of 2.72 days.
On the other hand, regional differences in testing policy and human behavior in voluntary testing may affect when a patient is diagnosed and isolated, and, in turn, affect the infectious period. In Subsection 4.1, we estimate the infectious period distribution in BC from the daily number of diagnosed cases and symptom onsets.
4.1. Estimate the infectious period
We assume that the patients will be isolated after being diagnosed. Therefore, the end of their infectious period is marked by diagnosis, not recovery. We assume that the infectious period is gamma-distributed [22,23,24,25,26], with a shape parameter α and a scale parameter ε.
We digitized and tabulated the daily number of symptom onsets for the period of January 15 to June 7, 2020 from the British Columbia COVID-19 Daily Situation Report released on June 9 [27].
Using an approximation similar to (5), the expected number of patients showing symptoms on day t can be calculated from the diagnosed cases on day t+i (mt+i) as
We assume that Ct is the observed symptom onset count on day t; it follows a Poisson distribution with the mean λt, i.e.,
We use the Markov chain Monte Carlo method via the R package "R2jag" to estimate the distribution parameters α and ε. The prior distributions of the parameters are chosen to follow a uniform distribution with wide intervals:
The results are given in Table 2. Figure 3 shows the point estimate and 95 confidence interval of the estimated density function of the infectious period distribution.
4.2. Estimate the effective reproduction number
Using the point estimate of the infectious period distribution in Subsection 4.1, we estimate the reproduction number as a function of time in BC, Canada.
Figure 4 shows both the epidemic curve (reported cases) and the estimated reproduction number. This figure shows that, since the provincial state of emergency on March 17, the reproduction number was controlled to below 1 until the relaxation (business reopening) announced on May 19. The reproduction number then increased gradually after the relaxation to 1.76 in June, being largely maintained until August 1st, at which point it was about 1.60. The strengthening of control measures on August 5 reduced the reproduction number and eventually controlled it to around 1 on September 10. It then increased again to a peak value of 1.67 on October 10. It was then brought back to about unity beginning on November 13.
5.
Concluding remarks
We have developed a novel method to estimate the change of the reproduction number with time. Using simulated data, we have shown that our method can estimate the change in the reproduction number due to either seasonal forcing or control measures. This means that our method is widely applicable to understand the change of the transmission rate.
Applying our method to the COVID-19 outbreak in BC, Canada in 2020 shows that the strengthening of control measures such as social distancing, restricting gathering and closing schools from March 20 to the end of May successfully reduced the reproduction number to below 1, except for a period in early April (may be due to clustered cases in long-term care facilities [28], or the gathering activities of the Easter holiday). However, the reproduction number gradually increased to above 1 after business reopening in May, even though the case counts did not exhibit an immediate increase. This shows that our method is very sensitive as a tool to detect the changes in reproduction number. The same increase also happened after the school reopening in September, which eventually triggered the fast increase of cases in October and early November. Note that, during this time, the variants of concern had not shown up yet, that is, no variants of concern appeared in 2020 [29,30]. Thus, the increase of cases is mostly likely due to the relaxation of control measures.
Not surprisingly, our estimation yields a narrower confidence interval with a larger case count. Our method also relies on reliable estimation of the latent and infectious periods, which may be difficult to estimate during the early stage of a disease outbreak. However, we have also demonstrated that our method can be adapted to estimate the infectious period from the daily counts of symptom onset and diagnosed cases.
Another limitation of our method is that it ignores asymptomatic and pre-symptomatic transmissions, which may be an important factor driving the COVID-19 transmission. However, this may not significantly affect our method if the ratio of asymptomatic cases to all cases remains roughly constant, as the proportional fact is canceled in our formulation.
Our method provides a new tool for analyzing host immunity resulting from the effective vaccinations, with and without NPI measures. At the onset of disease spread, the effective vaccination rate v is the primary factor influencing the number of susceptible individuals. In this case, the effective reproduction number can be expressed as Rt=β(1−v)StE(X), which is similar to (3). If the effective vaccination rate v is known, our method can estimate the temporal changes in β. However, in the absence of information about v, it is only possible to estimate the value of β(1−v), not the individual parameters β and v.
Our method can be used to study other infectious diseases as well. For instance, it can be used to investigate the influence of seasonality on the transmission of seasonal influenza, or to examine the effect of control measures on historical outbreaks, such as pandemic influenza, SARS and Ebola. Furthermore, our method can be applied to the study of vector-borne diseases, including those transmitted by mosquitoes, by extending our model to consider the disease transmission from person to person, with mosquitoes as the vectors. However, obtaining the specific changes in β is challenging, as the infection rate through the vector depends on the change in infected mosquitoes, resulting in a more complex dependence of β on mosquitoes than the simple SEIR model. Therefore, further research is needed to address this complexity. Additionally, the same generalization can be applied to sexually transmitted infections.
Use of AI tools declaration
The authors declare that they have not used artificial intelligence tools in the creation of this article.
Acknowledgments
This research was supported by the National Natural Science Foundation of China (No. 12271088) (ML), the Natural Sciences Foundation of Shanghai (No. 21ZR1401000) (ML) and a discovery grant from the Natural Sciences and Engineering Research Council Canada (JM), as well as two NSERC EIDM grants (OMNI and MfPH) (JM).
Conflict of interest
The authors declare that there is no conflict of interest.