Wastewater sampling for the detection and monitoring of SARS-CoV-2 has been developed and applied at an unprecedented pace; however, uncertainty remains when interpreting the measured viral RNA signals and their spatiotemporal variation. The large number of measurements that fall below a quantifiable threshold, usually during non-endemic periods, poses a further challenge to interpretation and time-series analysis of the data. Inspired by research on a custom Kalman smoother model for estimating the true level of SARS-CoV-2 RNA concentrations in wastewater, we propose an alternative left-censored dynamic linear model. Cross-validation of both models alongside a simple moving average, using data from 286 sewage treatment works across England, provides a comprehensive validation of the proposed approach. The presented dynamic linear model is more parsimonious, is faster to compute and sits within a more flexible modelling framework than the equivalent Kalman smoother. Furthermore, we show that wastewater data transformed by such models correlate more closely with regional case rate positivity as published by the Office for National Statistics (ONS) Coronavirus (COVID-19) Infection Survey. The modelled output is more robust and is therefore better able to complement traditional surveillance than untransformed data or a simple moving average, providing additional confidence and utility for public health decision making.
The detection and monitoring of SARS-CoV-2 in wastewater have been developed and carried out at an unprecedented pace, but the interpretation of measured viral RNA concentrations, and of their spatiotemporal variation, remains open to question. In particular, the large proportion of measurements below the quantification threshold, generally during non-endemic periods, poses a challenge for the analysis of these time series. Inspired by research that produced a Kalman smoother adapted to estimating true SARS-CoV-2 RNA concentrations in wastewater from this type of data, we propose a new left-censored dynamic linear model. Cross-validation of these smoothers, together with simple moving-average smoothing, on data from 286 sewage treatment works covering England, provides a comprehensive validation of the proposed approach. The presented model is more parsimonious, offers a more flexible modelling framework and requires less computation time than the equivalent Kalman smoother. Wastewater data smoothed in this way are also more strongly correlated with the regional incidence rate produced by the Office for National Statistics (ONS) Coronavirus Infection Survey. They prove more robust than raw data, or data smoothed by a simple moving average, and are therefore better suited to complementing traditional surveillance, thereby strengthening confidence in wastewater-based epidemiology and its usefulness for public health decision making.
Citation: Luke Lewis-Borrell, Jessica Irving, Chris J. Lilley, Marie Courbariaux, Gregory Nuel, Leon Danon, Kathleen M. O'Reilly, Jasmine M. S. Grimsley, Matthew J. Wade, Stefan Siegert. Robust smoothing of left-censored time series data with a dynamic linear model to infer SARS-CoV-2 RNA concentrations in wastewater. AIMS Mathematics, 2023, 8(7): 16790-16824. doi: 10.3934/math.2023859
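To make the idea of a left-censored dynamic linear model concrete, the following is a minimal sketch and not the authors' specification. It assumes a random-walk (local-level) latent state, Gaussian observation noise on a log-concentration scale, and a single known limit of quantification. The function names and the parameters `loq`, `sd_state` and `sd_obs` are hypothetical. A quantified measurement contributes a normal density, while a censored one contributes the probability of falling below the limit.

```python
import math


def norm_logpdf(x, mean, sd):
    """Log-density of a normal distribution."""
    z = (x - mean) / sd
    return -0.5 * z * z - math.log(sd) - 0.5 * math.log(2.0 * math.pi)


def norm_logcdf(x, mean, sd):
    """Log of the normal CDF; erfc keeps some precision in the lower tail."""
    z = (x - mean) / sd
    cdf = 0.5 * math.erfc(-z / math.sqrt(2.0))
    return math.log(max(cdf, 1e-300))  # guard against log(0)


def censored_dlm_logdensity(states, obs, censored, loq, sd_state, sd_obs):
    """Unnormalised joint log-density of latent states and observations.

    Assumed (hypothetical) model:
      state[t] ~ Normal(state[t-1], sd_state)   random-walk evolution
      obs[t]   ~ Normal(state[t], sd_obs)       if quantified
      P(obs[t] < loq) = Phi((loq - state[t]) / sd_obs)   if left-censored
    Entries of `obs` set to None are treated as missing samples.
    """
    total = 0.0
    # State evolution: each state is centred on the previous one.
    for t in range(1, len(states)):
        total += norm_logpdf(states[t], states[t - 1], sd_state)
    # Observation model with left-censoring at the limit of quantification.
    for x, y, is_censored in zip(states, obs, censored):
        if is_censored:
            total += norm_logcdf(loq, x, sd_obs)
        elif y is not None:
            total += norm_logpdf(y, x, sd_obs)
    return total
```

A density of this form could be handed to a sampler or optimiser to infer the latent states. The key design choice is that censored readings are never replaced by a substituted value; they only constrain the state to make a sub-threshold reading plausible.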