Research article Special Issues

Comparative assessment of parameter estimation methods in the presence of overdispersion: a simulation study

  • Received: 25 January 2019 Accepted: 21 April 2019 Published: 16 May 2019
  • The Poisson distribution is commonly assumed as the error structure for count data; however, empirical data may exhibit greater variability than expected based on a given statistical model. Greater variability could point to model misspecification, such as missing crucial information about the epidemiology of the disease or changes in population behavior. When the mechanism producing the apparent overdispersion is unknown, it is typically assumed that the variance in the data exceeds the mean (by some scaling factor). Thus, a probability distribution that allows for overdispersion (negative binomial, for example) may better represent the data. Here, we use simulation studies to assess how misspecifying the error structure affects parameter estimation results, specifically bias and uncertainty, as a function of the level of random noise in the data. We compare results for two parameter estimation methods: nonlinear least squares and maximum likelihood estimation with Poisson error structure. We analyze two phenomenological models, the generalized growth model and the generalized logistic growth model, to assess how the results of parameter estimation are affected by the level of overdispersion in the data. We use simulation to obtain confidence intervals and the mean squared error of parameter estimates. We also analyze the impact of the amount of data, or ascending phase length, on the results of the generalized growth model for increasing levels of overdispersion. The results show a clear pattern of increasing uncertainty, measured as confidence interval width, as the overdispersion in the data increases. While maximum likelihood estimation consistently yielded narrower confidence intervals and smaller mean squared error, differences between the two methods were minimal and not practically significant. At moderate levels of overdispersion, both estimation methods yielded similar performance. Importantly, we show that issues of parameter uncertainty and bias in the presence of overdispersion can be mitigated with the inclusion of more data.
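The simulation design summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' code, and it rests on several assumptions that the abstract does not state: the generalized growth model is taken in its commonly published form C'(t) = r C(t)^p with 0 <= p < 1; overdispersed counts are drawn from a negative binomial distribution whose variance equals d times the mean; and a coarse grid search stands in for a proper nonlinear least squares or likelihood optimizer. The true values (r = 0.5, p = 0.9), the grid ranges, and all function names are hypothetical.

```python
import numpy as np


def ggm_incidence(t, r, p, c0=1.0):
    """Incidence C'(t) = r * C(t)**p, using the closed-form C(t) valid for 0 <= p < 1."""
    cum = (r * (1.0 - p) * t + c0 ** (1.0 - p)) ** (1.0 / (1.0 - p))
    return r * cum ** p


def simulate_counts(mean, d, rng, size=None):
    """Draw counts with variance = d * mean; d = 1 reduces to Poisson."""
    if d <= 1.0:
        return rng.poisson(mean, size=size)
    # Negative binomial with n = mean / (d - 1) and success probability 1 / d.
    return rng.negative_binomial(mean / (d - 1.0), 1.0 / d, size=size)


def make_grid(t, r_values, p_values):
    """Candidate (r, p) pairs and their model curves, one row per pair."""
    rr, pp = np.meshgrid(r_values, p_values, indexing="ij")
    params = np.column_stack([rr.ravel(), pp.ravel()])
    curves = ggm_incidence(t[None, :], params[:, :1], params[:, 1:])
    return params, curves


def fit_least_squares(y, params, curves):
    """Grid-search stand-in for nonlinear least squares."""
    return params[np.argmin(((y - curves) ** 2).sum(axis=1))]


def fit_poisson_mle(y, params, curves):
    """Grid-search stand-in for Poisson maximum likelihood (constant term dropped)."""
    return params[np.argmin((curves - y * np.log(curves)).sum(axis=1))]


def run_study(d_levels=(1.0, 5.0, 20.0), n_rep=200, seed=0):
    """Return {d: {method: MSE of (r, p)}} for hypothetical true values."""
    rng = np.random.default_rng(seed)
    t = np.arange(30.0)  # hypothetical 30-day ascending phase
    truth = np.array([0.5, 0.9])  # hypothetical true (r, p)
    params, curves = make_grid(t, np.linspace(0.3, 0.7, 41), np.linspace(0.7, 0.98, 29))
    mean = ggm_incidence(t, *truth)
    out = {}
    for d in d_levels:
        sims = [simulate_counts(mean, d, rng) for _ in range(n_rep)]
        out[d] = {
            name: np.mean([(fit(y, params, curves) - truth) ** 2 for y in sims], axis=0)
            for name, fit in (("ls", fit_least_squares), ("mle", fit_poisson_mle))
        }
    return out


for d, res in run_study().items():
    print(d, res)
```

Each printed row gives, for one variance-to-mean ratio d, the mean squared error of the (r, p) estimates under each objective. Because the grid is coarse, the errors are quantized to the grid spacing; a real study would replace the grid search with a continuous optimizer and would also report confidence intervals from the simulated estimates.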

    Citation: Kimberlyn Roosa, Ruiyan Luo, Gerardo Chowell. Comparative assessment of parameter estimation methods in the presence of overdispersion: a simulation study. Mathematical Biosciences and Engineering, 2019, 16(5): 4299-4313. doi: 10.3934/mbe.2019214





  • © 2019 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)
