Analyzing the effect of duration on the daily new cases of COVID-19 infections and deaths using bivariate Poisson regression: a marginal conditional approach
The whole world is devastated by the impact of the COVID-19 pandemic. The socioeconomic and other effects of COVID-19 on people are visible in all echelons of society. The main goal of countries is to stop the spread of this pandemic by reducing COVID-19 related new cases and deaths. In this paper, we analyzed two correlated count outcomes, daily new cases and fatalities, and assessed the impact of some covariates by adopting a generalized bivariate Poisson model. The effects of duration on new cases and deaths differ across countries. Also, the regional variation was found to differ, and population density has a significant impact on the outcomes.
Citation: Rafiqul Chowdhury, Gary Sneddon, M. Tariqul Hasan. Analyzing the effect of duration on the daily new cases of COVID-19 infections and deaths using bivariate Poisson regression: a marginal conditional approach[J]. Mathematical Biosciences and Engineering, 2020, 17(5): 6085-6097. doi: 10.3934/mbe.2020323
1. Introduction
The world has been devastated by the outbreak of COVID-19, caused by SARS-CoV-2, since January 2020. While many countries were able to control the spread, many others are witnessing infections rising at an alarming rate as a second wave emerges. Developed, developing, and under-developed countries alike are facing the unforeseen challenges caused by the COVID-19 pandemic, which has taken a significant toll on many aspects of people's lives all across the world. The growing COVID-19 crisis has hit developing countries not only as a health crisis in the short term; it will also have a devastating socio-economic impact over the months and years to come. The level of anxiety in Africa, Europe, Bangladesh, Brazil, India, Iran, the USA, South Korea, and many other countries has increased recently due to new cases and fatalities. The socio-economic, psychological, and other impacts have already started [1,2]. It is difficult to diagnose this disease, and a delay between the onset of symptoms and an accurate diagnosis has already been observed [3]. Also, there are still many undiagnosed and delayed-diagnosis infections due to a lack of diagnostic kits. Both the undiagnosed and diagnosed infections have a very high ability to transmit the virus to other susceptible people or family members [4,5]. The ability to diagnose and identify infected persons in time, together with treatment, has a tremendous impact on daily new cases and deaths. The timely and effective isolation of symptomatic and confirmed COVID-19 cases is crucial for controlling this pandemic. The first wave of this pandemic already proved how it could overwhelm under-resourced hospitals and fragile health systems in many countries.
COVID-19 data are now available from various online sources. The majority of them report daily new cases and deaths, along with very few other characteristics. For example, through official communications under the International Health Regulations (IHR, 2005), and by monitoring official ministry of health websites and social media accounts, the World Health Organization (WHO) collected the numbers of confirmed COVID-19 cases and deaths from 31 December 2019 to 21 March 2020 [6]. Since 22 March 2020, daily global data have been compiled through WHO region-specific dashboards. Also, Worldometer [7] reports country-specific data along with a few other characteristics. Some sources also report the number of daily tests performed to detect infections. These data are mostly used to display daily trends for new cases and deaths. A significant number of research papers on the COVID-19 pandemic have already been published in various journals around the world. The objectives of much of this research are to predict the trends for new cases and deaths and to assess the impact of available covariates on these outcomes. Policy makers are using these data to make plans and to slow the spread of this pandemic. Already, many countries have slowed down the spread of SARS-CoV-2, the virus that causes COVID-19, by taking stringent measures. However, many developing and under-developed countries failed to slow down the infection rate with stringent measures due to local circumstances and culture. Every region in the world needs to make progress against COVID-19; only then can we be safe. However, globally we are very far from that goal, and the global number of confirmed cases is growing enormously fast.
The daily new cases and deaths, two count outcomes, are naturally correlated and dependent. Analyzing these two dependent count outcomes and assessing the impact of related covariates requires a proper and accurate modeling approach. A marginal-conditional modeling approach for correlated count outcomes allows us to determine the covariate impact on the responses jointly and to obtain better predictions. We employ a generalized linear covariate-dependent bivariate Poisson model to analyze the dependent daily new cases and deaths and to assess the impact of some characteristics on the outcomes. The rest of the paper is organized as follows. In Section 2, we present the methodology used in this paper. The results of the data analysis are presented in Section 3. Finally, in Section 4 we discuss the conclusions.
2. Bivariate Poisson regression model
In this section, we discuss the methodology used to model the daily new cases of COVID-19 infections and deaths reported by the World Health Organization (WHO). The WHO regularly updates daily new cases and deaths for each country. It is worth noting that these daily counts of new cases and deaths are dependent, or correlated: for example, more new cases can overwhelm the health care system, which in turn can cause more deaths. Moreover, it is vital to study the impact of duration and other available characteristics on these two dependent count outcomes. A bivariate Poisson model can be used to analyze such correlated count outcomes, providing in-depth insights into the dynamics among new cases, deaths, and covariates.
To analyze these dependent count outcomes along with covariates, we use a bivariate generalized Poisson model [8]. This bivariate generalized Poisson regression model uses a marginal-conditional modeling approach based on the Poisson-Poisson distribution. Let $Y_{i1}$ and $Y_{i2}$ be the count responses of the daily new cases of COVID-19 infections and deaths, respectively, for the $i$th ($i=1,2,\ldots,n$) day. It can be shown that the joint distribution of $Y_{i1}$ and $Y_{i2}$ is [8]:
$$g(y_{i1},y_{i2})=\frac{e^{-\lambda_{i1}}\lambda_{i1}^{y_{i1}}}{y_{i1}!}\cdot\frac{e^{-\lambda_{i2}y_{i1}}(\lambda_{i2}y_{i1})^{y_{i2}}}{y_{i2}!}, \tag{2.1}$$
where $\ln\lambda_{i1}=x_i'\beta_1$ and $\ln\lambda_{i2}=x_i'\beta_2$ are the link functions. In (2.1), $x_i'=(1,x_{i1},\cdots,x_{ip})$ is the vector of covariates, and $\beta_1'=(\beta_{10},\beta_{11},\cdots,\beta_{1p})$ and $\beta_2'=(\beta_{20},\beta_{21},\cdots,\beta_{2p})$ are the regression coefficients corresponding to new infections and deaths, respectively. It can be shown that the marginal means of $Y_{i1}$ and $Y_{i2}$ are $E(Y_{i1})=\mu_{i1}=\lambda_{i1}$ and $E(Y_{i2})=\mu_{i2}=\lambda_{i1}\lambda_{i2}$, respectively. Note that the subscript $i$ of $\lambda$ represents different subjects arising from varying combinations of covariate values.
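As a quick numerical check of the marginal-conditional structure, the sketch below evaluates the joint probability as the product of a Poisson($\lambda_1$) marginal and a Poisson($\lambda_2 y_1$) conditional, then confirms on a truncated grid that the probabilities sum to one and that the means match $\lambda_1$ and $\lambda_1\lambda_2$. The function name `joint_pmf`, the rate values, and the grid size are illustrative assumptions, not quantities from the paper.

```python
import math

def joint_pmf(y1: int, y2: int, lam1: float, lam2: float) -> float:
    """Joint probability under Y1 ~ Poisson(lam1), Y2 | Y1 = y1 ~ Poisson(lam2 * y1)."""
    marginal = math.exp(-lam1) * lam1 ** y1 / math.factorial(y1)
    cond_mean = lam2 * y1
    # When y1 = 0 the conditional mean is 0.0; Python evaluates 0.0 ** 0 as 1.0,
    # so all conditional mass falls on y2 = 0, as it should.
    conditional = math.exp(-cond_mean) * cond_mean ** y2 / math.factorial(y2)
    return marginal * conditional

lam1, lam2 = 3.0, 0.4  # hypothetical rates, not estimates from the paper
grid = range(60)       # truncation point chosen so the omitted tail mass is negligible

total = sum(joint_pmf(a, b, lam1, lam2) for a in grid for b in grid)
mean_y1 = sum(a * joint_pmf(a, b, lam1, lam2) for a in grid for b in grid)
mean_y2 = sum(b * joint_pmf(a, b, lam1, lam2) for a in grid for b in grid)

print(f"total mass = {total:.6f}")
print(f"E(Y1) = {mean_y1:.6f} (lam1 = {lam1})")
print(f"E(Y2) = {mean_y2:.6f} (lam1 * lam2 = {lam1 * lam2})")
```

Because the conditional mean is proportional to $y_1$, a day with no new cases is assigned zero probability of any deaths under this model.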
To estimate the regression parameters, we take the first and second derivatives of the log-likelihood function based on (2.1) to develop the estimating equations and the information matrix $I_o$. The approximate variance-covariance matrix for $\hat{\beta}'=(\hat{\beta}_1',\hat{\beta}_2')$ is $\widehat{\mathrm{Var}}(\hat{\beta})=I_o^{-1}$. Using the Newton-Raphson method, we can obtain the estimated parameters.
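The estimation step can be illustrated with a small simulation. This is a minimal sketch and not the bpglm implementation; it assumes the log-likelihood is the sum of a Poisson term with mean $\lambda_{i1}$ and a Poisson term with mean $\lambda_{i2}y_{i1}$. Under that assumption the score vectors are $\sum_i x_i(y_{i1}-\lambda_{i1})$ and $\sum_i x_i(y_{i2}-\lambda_{i2}y_{i1})$, the information matrix is block diagonal, and the two coefficient vectors can be updated separately. The helper `newton_poisson`, the simulated covariate, and the "true" coefficients are all hypothetical.

```python
import numpy as np

def newton_poisson(X, y, exposure, n_iter=50, tol=1e-10):
    """Newton-Raphson for a log-link Poisson model with mean exposure * exp(X @ beta)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = exposure * np.exp(X @ beta)
        score = X.T @ (y - mu)          # first derivative of the log-likelihood
        info = X.T @ (X * mu[:, None])  # negative second derivative
        step = np.linalg.solve(info, score)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    mu = exposure * np.exp(X @ beta)
    cov = np.linalg.inv(X.T @ (X * mu[:, None]))  # inverse information at the estimate
    return beta, cov

# Simulated data with hypothetical coefficients (not values from the paper).
rng = np.random.default_rng(0)
n = 5000
X = np.column_stack([np.ones(n), rng.uniform(0.0, 1.0, n)])
beta1_true = np.array([1.0, 0.5])
beta2_true = np.array([-1.5, 0.8])
y1 = rng.poisson(np.exp(X @ beta1_true))
y2 = rng.poisson(np.exp(X @ beta2_true) * y1)

# Marginal model for Y1 (unit exposure) and conditional model for Y2 (exposure y1).
beta1_hat, cov1 = newton_poisson(X, y1, np.ones(n))
beta2_hat, cov2 = newton_poisson(X, y2, y1.astype(float))

print("beta1_hat:", beta1_hat, "s.e.:", np.sqrt(np.diag(cov1)))
print("beta2_hat:", beta2_hat, "s.e.:", np.sqrt(np.diag(cov2)))
```

Treating $y_{i1}$ as an exposure in the conditional model is equivalent to adding $\ln y_{i1}$ as an offset; days with $y_{i1}=0$ then contribute nothing to the conditional score beyond their observed deaths.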
Now, using the Poisson-multinomial relationship, we can predict the probabilities for bivariate Poisson outcomes [9]. For notational convenience, the subscript $i$ is omitted in what follows. We estimate the joint probability in two steps. First, we calculate the marginal probability for each count $m$ of outcome $Y_1$ ($m=0,\cdots,k_1$). Second, we estimate the conditional probability for each count $s$ of $Y_2$ ($s=0,\cdots,k_2$) for a given value $m$ of $Y_1$. The estimate of the marginal probability from the Poisson distribution can be obtained as:
$$\hat{P}(Y_1=m\mid x)=\hat{P}_m=\frac{e^{-\hat{\mu}_m}\hat{\mu}_m^{m}}{m!}, \quad \text{where } \sum_{m=0}^{k_1}\hat{P}_m=1. \tag{2.4}$$
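Here, $\hat{P}(Y_1=m\mid x)$ is the estimated probability of the specific value $m$ of $Y_1$ for a given covariate value $x$. In other words, we estimate the probabilities of each value of $Y_1$ by considering all the combinations of different covariate patterns.
For $Y_1=m$, let $y_{m1}+\cdots+y_{ml}+\cdots+y_{mn_m}=n_m$, where $y_{ml}=1$ if $Y_1=m$ and $y_{ml}=0$ otherwise, $m=0,1,\cdots,k_1$, $l=1,\cdots,n_m$, and $\sum_{m=0}^{k_1}n_m=n$. The estimate of $P_m$ is
$$\hat{P}_m=\frac{\hat{\mu}_m}{\sum_{m=0}^{k_1}\hat{\mu}_m}, \quad \text{where } \hat{\mu}_m=\frac{\sum_{l=1}^{n_m}\hat{\mu}_{ml}}{\sum_{m=0}^{k_1}\sum_{l=1}^{n_m}\hat{\mu}_{ml}} \text{ and } \hat{\mu}_{ml}=e^{x_{ml}'\beta_1}.$$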
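For the conditional probabilities of $Y_2=s$ for any given value of $Y_1=m$, the corresponding estimates of the multinomial probabilities are
$$\hat{P}_{s\mid m}=\frac{\hat{\mu}_{s\mid m}}{\sum_{s=0}^{k_2}\hat{\mu}_{s\mid m}}, \quad m=0,\cdots,k_1, \quad s=0,\cdots,k_2,$$
where
$$\hat{\mu}_{s\mid m}=\frac{\sum_{h=1}^{n_{sm}}\hat{\mu}_{sh\mid m}}{\sum_{s=0}^{k_2}\sum_{h=1}^{n_{sm}}\hat{\mu}_{sh\mid m}} \quad \text{and} \quad \hat{\mu}_{sh\mid m}=e^{x_{sh\mid m}'\beta_2}.$$
For $Y_2=s$, let $y_{s1\mid m}+\cdots+y_{sh\mid m}+\cdots+y_{sn_{sm}\mid m}=n_{sm}$, where $y_{sh\mid m}=1$ if $Y_1=m$ and $Y_2=s$, and $y_{sh\mid m}=0$ otherwise, $m=0,1,\cdots,k_1$, $h=1,\cdots,n_{sm}$, and $\sum_{s=0}^{k_2}n_{sm}=n_m$. For more comprehensive illustrations, readers are referred to [9]. Consequently, the joint probability of $Y_1=m$ and $Y_2=s$ can be estimated as follows: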
$$\hat{P}(Y_1=m,Y_2=s)=\hat{P}(Y_1=m)\times\hat{P}(Y_2=s\mid Y_1=m)=\hat{P}_m\times\hat{P}_{s\mid m} \tag{2.5}$$
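The grouping-and-normalizing step behind these multinomial estimates can be sketched as follows. This is one reading of the formulas above, not the authors' code: fitted means are summed within each observed category and divided by the relevant total, and the joint estimate is the product of the marginal and conditional pieces. The function `multinomial_probs` and the toy inputs are hypothetical.

```python
import numpy as np

def multinomial_probs(y1, y2, mu1, mu2, k1, k2):
    """Marginal, conditional, and joint probability estimates from grouped fitted means.

    y1, y2 : observed (capped) integer counts, one pair per observation
    mu1    : fitted marginal means, one per observation
    mu2    : fitted conditional means, one per observation
    """
    p_m = np.zeros(k1 + 1)
    np.add.at(p_m, y1, mu1)            # sum fitted means within each Y1 = m group
    p_m /= p_m.sum()                   # normalize over m = 0..k1

    p_s_given_m = np.zeros((k1 + 1, k2 + 1))
    np.add.at(p_s_given_m, (y1, y2), mu2)  # sum within each (m, s) cell
    row_tot = p_s_given_m.sum(axis=1, keepdims=True)
    # Normalize each row over s; rows with no observations stay at zero.
    p_s_given_m = np.divide(p_s_given_m, row_tot,
                            out=np.zeros_like(p_s_given_m), where=row_tot > 0)

    joint = p_m[:, None] * p_s_given_m  # P(Y1 = m) * P(Y2 = s | Y1 = m)
    return p_m, p_s_given_m, joint

# Toy inputs (hypothetical, chosen so the arithmetic is easy to follow).
y1 = np.array([0, 0, 1, 1, 1, 2])
y2 = np.array([0, 0, 0, 1, 1, 1])
mu1 = np.array([0.5, 0.5, 1.0, 1.0, 1.0, 2.0])
mu2 = np.array([0.1, 0.1, 0.3, 0.3, 0.3, 0.6])

p_m, p_s_given_m, joint = multinomial_probs(y1, y2, mu1, mu2, k1=2, k2=1)
print("P(Y1 = m):", p_m)
print("P(Y2 = s | Y1 = m):\n", p_s_given_m)
print("joint sums to:", joint.sum())
```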
We can easily predict the marginal, conditional, and joint probabilities using the fitted marginal and conditional models for new cases with different scenarios of outcomes and covariates. For marginal probabilities, we use the following equation
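$$\hat{g}(y_{i1})=\frac{e^{-\hat{\lambda}_{i1}}\hat{\lambda}_{i1}^{y_{i1}}}{y_{i1}!}, \tag{2.6}$$
and for conditional probabilities we use
$$\hat{g}(y_{i2}\mid y_{i1})=\frac{e^{-\hat{\lambda}_{i2}y_{i1}}(\hat{\lambda}_{i2}y_{i1})^{y_{i2}}}{y_{i2}!}. \tag{2.7}$$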
Then the joint probability can be estimated using the relation between Poisson and Multinomial described in Eqs (2.4) and (2.5).
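To illustrate how (2.7) yields a conditional probability curve for each death count, the sketch below tabulates $P(Y_2=s\mid Y_1=y_1)$ for $s=0,\ldots,4$ and a final "five or more" category. As a simplification, the last category is computed from the Poisson tail rather than through the multinomial normalization of Eqs (2.4) and (2.5); the value of `lam2_hat` is hypothetical and not an estimate from the paper.

```python
import math

def poisson_pmf(y: int, mean: float) -> float:
    """Poisson probability of count y for the given mean."""
    return math.exp(-mean) * mean ** y / math.factorial(y)

def death_probs_given_cases(y1: int, lam2: float, cap: int = 5) -> list:
    """P(Y2 = s | Y1 = y1) for s = 0..cap-1; the last entry is P(Y2 >= cap | Y1 = y1)."""
    mean = lam2 * y1                    # conditional mean from Eq (2.7)
    probs = [poisson_pmf(s, mean) for s in range(cap)]
    probs.append(1.0 - sum(probs))      # tail mass for the "cap or more" category
    return probs

lam2_hat = 0.08  # hypothetical conditional rate, for illustration only
table = {y1: death_probs_given_cases(y1, lam2_hat) for y1 in (0, 10, 25, 50)}

for y1, probs in table.items():
    print(y1, [round(p, 4) for p in probs])
```

With these illustrative inputs the probability of no deaths falls as the number of new cases rises, while the probability of five or more deaths rises.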
3. Analysis of World Health Organization COVID-19 Data
As of July 7, 2020, the World Health Organization (WHO) reported that, globally, there had been 11,500,302 confirmed cases of COVID-19, including 535,759 deaths. We downloaded the map data from the WHO Coronavirus Disease (COVID-19) Dashboard. This data set includes daily new cases and deaths from 215 countries around the world by WHO region. The two count outcomes, daily new cases and deaths, are defined as $Y_{i1}$ and $Y_{i2}$, respectively. The seven WHO regions, EURO (European), AFRO (African), AMRO (American), EMRO (Eastern Mediterranean), SEAR (South East Asian), WPRO (Western Pacific), and Other, are used as categorical covariates, with EURO as the reference category. The duration (in days) from the first reporting date and the population density per km² are also used as covariates in the bivariate Poisson regression model. To model the non-linear relationship with the outcomes and assess the effect of duration more accurately, we also used the squared duration as a covariate. The duration ranges from 0 to 176 days, and the density ranges from 0 to 26,337 per km². The ranges of these variables are vast, with long gaps between the values. For modeling purposes and to avoid convergence problems, we used deciles for each of these covariates. In addition, we recoded daily new cases above 50 as fifty and daily deaths above 5 as five, again to avoid convergence problems due to the wide variation in the worldwide data. We fitted two models. Model 1 included only main effect terms for the covariates described above: the WHO region indicators, duration, squared duration, and population density.
To analyze the data for this paper, we used the bpglm R package to fit all the models [10].
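The preprocessing described above (decile coding of duration and density, and capping of the two outcomes) might look like the following sketch. How the deciles were computed is not specified in the text, so coding each value by its empirical decile group (1 to 10) is an assumption, as are the helper names `to_deciles` and `cap_counts` and the sample counts.

```python
import numpy as np

def to_deciles(values):
    """Code each value by its empirical decile group, 1 (lowest) to 10 (highest)."""
    values = np.asarray(values, dtype=float)
    cuts = np.quantile(values, np.linspace(0.1, 0.9, 9))  # nine interior cut points
    return np.searchsorted(cuts, values, side="right") + 1

def cap_counts(counts, cap):
    """Recode every count above `cap` as `cap` (e.g. 50 means fifty or more)."""
    return np.minimum(np.asarray(counts), cap)

duration = np.arange(0, 177)  # 0 to 176 days, the range reported in the text
duration_decile = to_deciles(duration)

new_cases = cap_counts([0, 12, 50, 731], cap=50)  # illustrative counts
deaths = cap_counts([0, 3, 5, 9], cap=5)

print("decile codes span", duration_decile.min(), "to", duration_decile.max())
print("capped cases:", new_cases)
print("capped deaths:", deaths)
```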
The results of the proposed bivariate Poisson model with main effects only (Model 1) are presented in Table 1. They indicate that all the WHO regional indicators show a statistically significant relationship with daily new cases. Except for the EMRO region, all regions show fewer daily new cases, on average, than the reference EURO region; the EMRO region shows more. This means that, after the first case was diagnosed, daily new COVID-19 infections were lower in every region except EMRO than in EURO. It also suggests that, compared with EURO, the COVID-19 outbreak spread more slowly in the other regions but more rapidly in the EMRO region. Table 1 indicates that both the linear and quadratic terms in duration are statistically significant in the model. The estimates tell us that, with the other variables fixed, the number of new cases rises as the duration increases up to about 6, then decreases. Our results also indicate that population density has a significant negative effect on the number of cases. This is an indication that, under proper safety measures, it is possible to control new cases even in highly populated areas. For the conditional model, with daily death counts as the outcome, all but three WHO region indicators showed fewer daily deaths, on average, than the EURO region. We consider the adjusted p-values, which adjust for over- (or under-) dispersion. The other covariates show a significant relationship with the daily number of deaths. However, we believe this may vary for country-specific data.
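The rise-then-fall pattern in duration follows from the signs of the linear and quadratic coefficients. As a worked sketch, write the duration part of the marginal linear predictor with hypothetical symbols $\beta_d$ (linear term) and $\beta_q$ (quadratic term), collect all other terms in $c_i$, and let $d_i$ denote the coded duration (presumably on the decile scale used in the model):

$$\ln\lambda_{i1}=c_i+\beta_d d_i+\beta_q d_i^2, \qquad \frac{\partial\ln\lambda_{i1}}{\partial d_i}=\beta_d+2\beta_q d_i=0 \;\Longrightarrow\; d^{*}=-\frac{\beta_d}{2\beta_q}.$$

When $\beta_d>0$ and $\beta_q<0$, the expected number of new cases increases for $d_i<d^{*}$ and decreases afterwards, so a turning point near 6 corresponds to $-\beta_d/(2\beta_q)\approx 6$.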
Table 1. Estimated coefficients of main effects using bivariate Poisson model.
In Table 2, we present the results using the proposed bivariate Poisson model including the interaction terms as covariates with the main effects (Model 2). In this model, along with the main effects, we included interaction terms between duration and density and the interaction terms between duration and all the WHO region indicators. All the WHO region indicators (main effect terms) showed a significant (p<0.01) reduction of daily new cases compared to the reference indicator EURO, except for Other regions. However, we need to be cautious with the interpretation of these terms because of the inclusion of the interaction terms involving region, which are all significant.
Table 2. Estimated coefficients of main and interaction effects using bivariate Poisson model.
To compare the performance of the models, we calculated the log-likelihood, Akaike information criterion (AIC), Bayesian information criterion (BIC), and deviance for both Model 1 and Model 2. The results in Table 3 indicate that Model 2 performs better. Figure 1 displays the predicted conditional probabilities for the given daily number of new cases. The number of new cases varies from 0 to 50 (50 represents fifty or more) and is shown on the x-axis. Each line in the figure represents a number of deaths, which ranges from 0 to 5 (5 represents five or more). The top line shows the probability of no deaths for the given number of new cases. This probability shows a steady downward trend and approaches zero as the number of new cases increases, as one would expect. The solid line, the predicted probability of five or more deaths conditional on the number of daily new cases, rises sharply at fifty or more new cases, where the risk is around 0.29. This sharp increase is due to the collapsing of 50 or more new cases into fifty to avoid estimation problems. We note that deaths are also reported for some days with no new cases. Figure 2 presents the trajectories of joint probabilities for new cases and deaths. Overall, the risk for both new cases and deaths is gradually decreasing.
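For reference, the two information criteria used in the model comparison can be computed from a fitted log-likelihood as in the sketch below, using the standard definitions $\mathrm{AIC}=2k-2\ell$ and $\mathrm{BIC}=k\ln n-2\ell$. The log-likelihoods, parameter counts, and sample size are hypothetical placeholders, not the values reported in the paper.

```python
import math

def information_criteria(log_lik: float, n_params: int, n_obs: int):
    """Return (AIC, BIC) for a fitted model; smaller values indicate a better trade-off."""
    aic = 2 * n_params - 2 * log_lik
    bic = n_params * math.log(n_obs) - 2 * log_lik
    return aic, bic

n_obs = 5000  # hypothetical number of observations

# Hypothetical fits: a main-effects model and a larger model with interactions.
aic1, bic1 = information_criteria(log_lik=-1000.0, n_params=20, n_obs=n_obs)
aic2, bic2 = information_criteria(log_lik=-900.0, n_params=34, n_obs=n_obs)

print(f"Model 1: AIC = {aic1:.1f}, BIC = {bic1:.1f}")
print(f"Model 2: AIC = {aic2:.1f}, BIC = {bic2:.1f}")
```

BIC penalizes each extra parameter by $\ln n$ rather than 2, so with large samples it favors the larger model only when the gain in log-likelihood is substantial.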
We observed that the effect of duration on both outcomes, daily new cases and daily deaths, is positive in both models (Tables 1 and 2). We hypothesized that these relationships might not remain the same for country-specific data. The reason is that developed countries with advanced healthcare systems and proper planning to handle the COVID-19 pandemic may observe different results. For example, duration may have a negative relationship with both new cases and deaths. Alternatively, a longer duration may have a positive impact on daily new cases but a negative effect on daily deaths. However, it may also have a positive impact on both outcomes, as we observed in Model 1 and Model 2.
To assess the country-specific impact of duration on daily new cases and deaths, we fitted separate models for selected countries. We used duration as the only covariate because the WHO region and density are fixed values for each country. Table 3 presents the bivariate Poisson regression results for the selected countries. For China, increased duration decreases new cases and deaths significantly. A similar pattern is found for Myanmar, but it is not significant for the marginal model. For Malaysia, Sri Lanka, the Republic of Korea, Spain, Australia, and Morocco, duration reduces mortality (p<0.01), but the opposite holds for new cases. For the African country Tanzania, duration reduces daily new cases (p<0.01). The majority of countries observed increases in both daily new cases and deaths.
From the country-specific analysis, we plotted the predicted risk of the number of deaths given fifty or more new cases for the selected countries (Figures 3 to 6). The x-axis represents the number of deaths, and the predicted risks are on the y-axis. In Figure 3, the conditional probability (conditional on 50+ new cases) shows an upward trend and is highest for five or more fatalities in Italy (0.22), followed by Germany, Spain, and the UK, respectively. The risk of five or more fatalities (Figure 4) is highest for Australia, followed by France, the Netherlands, and New Zealand, respectively. The trend for the Netherlands is mostly flat, and for New Zealand it is close to zero. Figure 5 presents the same trajectories for Canada, Japan, the Republic of Korea, and the USA. The path for Canada starts ascending from 3 fatalities for the given new cases. The risk of five or more deaths is highest for Canada, around 0.12, followed by Japan, the Republic of Korea, and the USA, respectively. For China, the risk of five or more fatalities is highest, around 0.27 (Figure 6), followed by Malaysia, Morocco, and South Africa, respectively.
Figure 3. Conditional probability of number of deaths for given 50+ new cases.
4. Conclusions
The entire world is trying hard to reach the same goal: the new cases of COVID-19 need to go to zero. We can achieve this goal only if we can end the pandemic everywhere. To reduce COVID-19 infection, countries around the world are implementing various restrictions. These measures include travel restrictions and quarantine of both suspected individuals and subjects who have had close contact with suspected cases. Reducing new infections toward zero will reduce the spread of this disease and the number of deaths. In this paper, we have analyzed daily new COVID-19 cases and deaths (two correlated count outcomes) and assessed the impact of duration, region, and density. The results for the world data showed that increasing duration significantly increased the number of daily new cases and deaths. Population density showed a negative effect on both outcomes. We believe this may have happened because some densely populated countries took rigorous measures to fight this pandemic. We further investigated the impact of density, duration, and region by introducing interaction effects in the model. In the interaction model, density showed a positive relationship with the number of deaths.
Regarding regional impact, compared with EURO, all other regions except Other showed a reduction in new cases. However, the AMRO and EMRO regions showed an increase in deaths compared with the reference EURO region. These findings are based on the world's daily reported new cases and deaths.
The country-specific analyses reveal variations in conclusions regarding the relationship between the correlated outcomes and duration. With the introduction of various stringent measures to fight this pandemic by many countries, we expect that, over an extended period, both new cases and deaths should go down. For some countries with a longer duration, both new cases and deaths are declining, e.g., China and Myanmar. On the other hand, a longer duration increases the daily new cases for the Republic of Korea, Malaysia, Sri Lanka, Morocco, and Iran, but reduces the number of deaths. Many other countries showed an increase in both new cases and deaths. These findings may indicate that the measures to fight this pandemic are more advantageous for some countries than for others. We believe that analysis with more detailed data, along with related risk factors, may provide an in-depth understanding of the dynamics.
Acknowledgments
This research was supported by grants from the Natural Sciences and Engineering Research Council of Canada (NSERC). The authors are grateful to the referees for their helpful comments on the original draft of our paper, which have served to greatly improve the presentation of our results.
Conflict of interest
The authors declare no conflict of interest.
References
[1] T. E. Carpenter, J. M. O'Brien, A. D. Hagerman, B. A. McCarl, Epidemic and economic impacts of delayed detection of foot-and-mouth disease: a case study of a simulated outbreak in California, J. Vet. Diagn. Invest., 23 (2011), 26-33. doi: 10.1177/104063871102300104
[2] P. W. Uys, R. Warren, P. D. van Helden, M. Murray, T. C. Victor, Potential of rapid diagnosis for controlling drug-susceptible and drug-resistant tuberculosis in communities where Mycobacterium tuberculosis infections are highly prevalent, J. Clin. Microbiol., 47 (2009), 1484-1490. doi: 10.1128/JCM.02289-08
[3] D. S. Hui, E. I. Azhar, T. A. Madani, F. Ntoumi, R. Kock, O. Dar, et al., The continuing 2019-nCoV epidemic threat of novel coronaviruses to global health-The latest 2019 novel coronavirus outbreak in Wuhan, China, Int. J. Infect. Dis., 91 (2020), 264-266. doi: 10.1016/j.ijid.2020.01.009
[4] C. Huang, Y. Wang, X. Li, L. Ren, J. Zhao, Y. Hu, et al., Clinical features of patients infected with 2019 novel coronavirus in Wuhan, China, Lancet, 395 (2020), 497-506. doi: 10.1016/S0140-6736(20)30183-5
[5] Q. Li, X. Guan, P. Wu, X. Wang, L. Zhou, Y. Tong, et al., Early transmission dynamics in Wuhan, China, of novel coronavirus-infected pneumonia, N. Engl. J. Med., 382 (2020), 1199-1207. doi: 10.1056/NEJMoa2001316
[6] WHO Coronavirus Disease (COVID-19) Dashboard, Map Data, 2020. Available from: https://covid19.who.int/.