exploring surveillance data biases when estimating the reproduction number with insights CORD-Papers-2022-06-02 (Version 1)

Title: Exploring surveillance data biases when estimating the reproduction number: with insights into subpopulation transmission of COVID-19 in England
Abstract: The time-varying reproduction number (R(t): the average number of secondary infections caused by each infected person) may be used to assess changes in transmission potential during an epidemic. While new infections are not usually observed directly, they can be estimated from data. However, data may be delayed and potentially biased. We investigated the sensitivity of R(t) estimates to different data sources representing COVID-19 in England, and we explored how this sensitivity could track epidemic dynamics in population sub-groups. We sourced public data on test-positive cases, hospital admissions and deaths with confirmed COVID-19 in seven regions of England over March through August 2020. We estimated R(t) using a model that mapped unobserved infections to each data source. We then compared differences in R(t) with the demographic and social context of surveillance data over time. Our estimates of transmission potential varied for each data source, with the relative inconsistency of estimates varying across regions and over time. R(t) estimates based on hospital admissions and deaths were more spatio-temporally synchronous than when compared to estimates from all test positives. We found these differences may be linked to biased representations of subpopulations in each data source. These included spatially clustered testing, and where outbreaks in hospitals, care homes, and young age groups reflected the link between age and severity of the disease. We highlight that policy makers could better target interventions by considering the source populations of R(t) estimates. Further work should clarify the best way to combine and interpret R(t) estimates from different data sources based on the desired use. This article is part of the theme issue 'Modelling that shaped the early COVID-19 pandemic response in the UK'.
Published: 2021-07-19
Journal: Philosophical transactions of the Royal Society of London. Series B Biological sciences
DOI: 10.1098/rstb.2020.0283
DOI_URL: http://doi.org/10.1098/rstb.2020.0283
Author Name: Sherratt Katharine
Author link: https://covid19-data.nist.gov/pid/rest/local/author/sherratt_katharine
Author Name: Abbott Sam
Author link: https://covid19-data.nist.gov/pid/rest/local/author/abbott_sam
Author Name: Meakin Sophie R
Author link: https://covid19-data.nist.gov/pid/rest/local/author/meakin_sophie_r
Author Name: Hellewell Joel
Author link: https://covid19-data.nist.gov/pid/rest/local/author/hellewell_joel
Author Name: Munday James D
Author link: https://covid19-data.nist.gov/pid/rest/local/author/munday_james_d
Author Name: Bosse Nikos
Author link: https://covid19-data.nist.gov/pid/rest/local/author/bosse_nikos
Author Name: Jit Mark
Author link: https://covid19-data.nist.gov/pid/rest/local/author/jit_mark
Author Name: Funk Sebastian
Author link: https://covid19-data.nist.gov/pid/rest/local/author/funk_sebastian
sha: 3fb1b3cd88e05c11185e2c797502a739dc994ed9
license: cc-by
license_url: https://creativecommons.org/licenses/by/4.0/
source_x: Medline; PMC; WHO
source_x_url: https://www.medline.com/ https://www.ncbi.nlm.nih.gov/pubmed/ https://www.who.int/
pubmed_id: 34053260
pubmed_id_url: https://www.ncbi.nlm.nih.gov/pubmed/34053260
pmcid: PMC8165604
pmcid_url: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8165604
url: https://doi.org/10.1098/rstb.2020.0283 https://www.ncbi.nlm.nih.gov/pubmed/34053260/
has_full_text: TRUE
Keywords Extracted from Text Content: R t UK COVID-19 linelist SARS-Cov-2 line COVID-19 https://doi.org/10.5281/zenodo.4029075 Stan [27] https://coronavirus.data.gov.uk/ about-data [19] patient R t upper CrI C t https://www.health.org.uk/newsandcomment/charts-and-infographics/covid-19-policy-tracker [24] . UK Government's patients FF100 CrI https://www.gov coronavirus SARS-CoV-2 UK R t coronavirus.data.gov.uk φ COVID-19 UK National Health Service SI2 SPI-M SI2B epiforecasts.io/covid/posts/national/ Office UK-specific NHS SI2A test-positives UK Github
Extracted Text Content in Record: First 5000 Characters: The time-varying reproduction number (Rt: the average number of secondary infections caused by each infected person) may be used to assess changes in transmission potential during an epidemic. While new infections are not usually observed directly, they can be estimated from data. However, data may be delayed and potentially biased. We investigated the sensitivity of Rt estimates to different data sources representing COVID-19 in England, and we explored how this sensitivity could track epidemic dynamics in population sub-groups. We sourced public data on test-positive cases, hospital admissions and deaths with confirmed COVID-19 in seven regions of England over March through August 2020. We estimated Rt using a model that mapped unobserved infections to each data source. We then compared differences in Rt with the demographic and social context of surveillance data over time. Our estimates of transmission potential varied for each data source, with the relative inconsistency of estimates varying across regions and over time. Rt estimates based on hospital admissions and deaths were more spatio-temporally synchronous than when compared to estimates from all test positives. We found these differences may be linked to biased representations of subpopulations in each data source. These included spatially clustered testing, and where outbreaks in hospitals, care homes, and young age groups reflected the link between age and severity of the disease. We highlight that policy makers could better target interventions by considering the source populations of Rt estimates. Further work should clarify the best way to combine and interpret Rt estimates from different data sources based on the desired use. This article is part of the theme issue 'Modelling that shaped the early COVID-19 pandemic response in the UK'.
Keywords Extracted from PMC Text: φ UK-specific NHS $D_t = \sum_{\tau} \xi_{\tau} I_{t-\tau}$ SPI-M CrI Stan [27] " linelist UK Government's coronavirus.data.gov.uk UK SI2B R [13,25,26 coronavirus SARS-CoV-2 90%CI line test-positives National Health Service SI2A upper CrI COVID-19 20–49 SI2 COVID-19 UK FF100 patients SARS-Cov-2 patient
Extracted PMC Text Content in Record: First 5000 Characters: Within six months of its emergence in late 2019, the novel coronavirus SARS-CoV-2 had caused over six million cases of disease (COVID-19) worldwide [1]. Its rapid initial spread and high death rate prompted global policy interventions to prevent continued transmission, with widespread temporary bans on social interaction outside the household [2]. Introducing and adjusting such policy measures depend on a judgement in balancing continued transmission potential with the multidimensional consequences of interventions. It is, therefore, critical to inform the implementation of policy measures with a clear and timely understanding of ongoing epidemic dynamics [3,4]. In principle, transmission could be tracked by directly recording all new infections. In practice, real-time monitoring of the COVID-19 epidemic relies on surveillance of indicators that are subject to different levels of bias and delay. In England, widely available surveillance data across the population include: (i) the number of positive tests, biased by changing test availability and practice, and delayed by the time from infection to symptom onset (if testing is symptom-based), from symptom onset to a decision to be tested and from test to test result; (ii) the number of new hospital admissions, biased by differential severity that triggers care seeking and hospitalization, and additionally delayed by the time to develop severe disease; and (iii) the number of new deaths due to COVID-19, biased by the differential risk of death and the exact definition of a COVID-19 death, and further delayed by the time to death. Each of these indicators provides a different view on the epidemic and therefore contains potentially useful information. However, any interpretation of their behaviour needs to reflect these biases and lags and is best done in combination with the other indicators.
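The way a reporting delay reshapes an indicator can be illustrated with a discrete convolution of the form D_t = Σ_τ ξ_τ I_{t−τ}, where I is the infection series and ξ is the delay distribution. The following is a minimal sketch under stated assumptions: the function name `observe`, the delay distribution and the ascertainment fraction are hypothetical illustration values, not the distributions or the model used in the study.

```python
import numpy as np


def observe(infections, delay_pmf, ascertainment=1.0):
    """Expected observations: ascertainment * sum_tau delay_pmf[tau] * infections[t - tau].

    delay_pmf[tau] is the assumed probability that an infection is
    observed tau days after it occurred (illustrative values only).
    """
    infections = np.asarray(infections, dtype=float)
    delay_pmf = np.asarray(delay_pmf, dtype=float)
    # Full convolution, truncated to the length of the infection series.
    return ascertainment * np.convolve(infections, delay_pmf)[: len(infections)]


# A single pulse of 100 infections on day 0 (hypothetical data).
infections = np.array([100, 0, 0, 0, 0, 0], dtype=float)
# Hypothetical delay: 50% observed after 1 day, 30% after 2, 20% after 3.
delay_pmf = np.array([0.0, 0.5, 0.3, 0.2])
# Assume half of all infections are ever observed.
cases = observe(infections, delay_pmf, ascertainment=0.5)
print(cases)
```

A longer delay distribution (for example, infection to death) would spread and shift the same pulse further in time, which is why each indicator lags the underlying infections by a different amount.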
One approach that allows this in a principled manner is to use the different datasets to separately track the time-varying reproduction number, Rt, the average number of secondary infections generated by each new infected person [5]. Because Rt quantifies changes in infection levels, it is independent of the level of overall ascertainment as long as this does not change over time or is explicitly accounted for [6]. At the same time, the underlying observations in each data source may result from different lags from infection to observation. However, if these delays are correctly specified then transmission behaviour over time can be consistently compared via estimates of Rt. Different methods exist to estimate the time-varying reproduction number, and in the UK a number of mathematical and statistical methods have been used to produce estimates used to inform policy [7–9]. Empirical estimates of Rt can be achieved by estimating time-varying patterns in transmission events from mapping to a directly observed time-series indicator of infection such as reported symptomatic cases. This can be based on the probabilistic assignment of transmission pairs [10], the exponential growth rate [11] or the renewal equation [12,13]. Alternatively, Rt can be estimated via mechanistic models that explicitly compartmentalize the disease transmission cycle into stages from susceptible through exposed, infectious and recovered [14,15]. This can include accounting for varying population structures and context-specific biases in observation processes, before fitting to a source of observed cases. Across all methods, key parameters include the time after an infection to the onset of symptoms in the infecting and infected, and the source of data used as a reference point for earlier transmission events [16,17]. 
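The claim that Rt does not depend on the overall level of ascertainment, provided ascertainment is constant, can be checked with a simple ratio estimator derived from the renewal equation, R_t = I_t / Σ_s w_s I_{t−s}. This is a sketch for illustration only: the function name `naive_rt`, the generation-interval weights and the synthetic incidence are assumptions, and this point estimator is a stand-in for, not a reproduction of, the Bayesian model used in the study.

```python
import numpy as np


def naive_rt(incidence, gen_interval):
    """Ratio estimator R_t = I_t / sum_s w_s * I_{t-s}, for s = 1..len(w).

    gen_interval[s-1] is the assumed probability that a secondary
    infection occurs s days after the primary one (illustrative values).
    Returns NaN where there is not enough history.
    """
    incidence = np.asarray(incidence, dtype=float)
    w = np.asarray(gen_interval, dtype=float)
    w = w / w.sum()  # normalise so the weights sum to one
    rt = np.full(len(incidence), np.nan)
    for t in range(len(w), len(incidence)):
        # Reverse the window so w[0] pairs with incidence[t-1], w[1] with incidence[t-2], ...
        past = incidence[t - len(w): t][::-1]
        force = np.dot(w, past)
        if force > 0:
            rt[t] = incidence[t] / force
    return rt


w = [0.2, 0.5, 0.3]                           # hypothetical generation interval
true_infections = 100 * 1.1 ** np.arange(20)  # synthetic 10% daily growth
rt_full = naive_rt(true_infections, w)        # every infection observed
rt_half = naive_rt(0.5 * true_infections, w)  # constant 50% ascertainment
print(np.allclose(rt_full[3:], rt_half[3:]))
```

Because a constant ascertainment fraction multiplies both the numerator and the denominator, it cancels in the ratio. A fraction that changes over time would not cancel, which is the caveat stated in the text.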
In this study, we used a modelling framework based on the renewal equation, adjusting for delays in observation to estimate regional and national reproduction numbers of SARS-CoV-2 across England. The same method was repeated for each of three sources of data that are available in real time. After assessing differences in Rt estimates by data source, we explored why this variation may exist. We compared the divergence between Rt estimates with spatio-temporal variation in case detection, and the proportion at risk of severe disease, represented by the age distribution of test-positive cases and hospital admissions and the proportion of deaths in care homes. Three sources of data provided the basis for our Rt estimates. Time-series case data were available by specimen date of test. This was a de-duplicated dataset of COVID-19 positive tests notified from all National Health Service (NHS) settings (Pillar One of the UK Government's testing strategy) [18] and by commercial partners in community settings outside of healthcare (Pillar Two). Hospital admissions were also available by date of admission if a patient had tested positive prior to admission, or by the day preceding diagnosis if they were tested after admission. Death data were available by date of death and included only those that
PDF JSON Files: document_parses/pdf_json/3fb1b3cd88e05c11185e2c797502a739dc994ed9.json
PMC JSON Files: document_parses/pmc_json/PMC8165604.xml.json
G_ID: exploring_surveillance_data_biases_when_estimating_the_reproduction_number_with_insights