why we are losing the war against covid 19 on the data front and how to reverse the CORD-Papers-2022-06-02 (Version 1)

Title: Why We Are Losing the War Against COVID-19 on the Data Front and How to Reverse the Situation
Abstract: With over 117 million COVID-19-positive cases declared and the death count approaching 3 million, we would expect that the highly digitalized health systems of high-income countries would have collected, processed, and analyzed large quantities of clinical data from patients with COVID-19. Those data should have served to answer important clinical questions such as: what are the risk factors for becoming infected? What are good clinical variables to predict prognosis? What kinds of patients are more likely to survive mechanical ventilation? Are there clinical subphenotypes of the disease? All these, and many more, are crucial questions to improve our clinical strategies against the epidemic and save as many lives as possible. One might assume that in the era of big data and machine learning, there would be an army of scientists crunching petabytes of clinical data to answer these questions. However, nothing could be further from the truth. Our health systems have proven to be completely unprepared to generate, in a timely manner, a flow of clinical data that could feed these analyses. Despite gigabytes of data being generated every day, the vast quantity is locked in secure hospital data servers and is not being made available for analysis. Routinely collected clinical data are, by and large, regarded as a tool to inform decisions about individual patients, and not as a key resource to answer clinical questions through statistical analysis. The initiatives to extract COVID-19 clinical data are often promoted by private groups of individuals and not by health systems, and are uncoordinated and inefficient. The consequence is that we have more clinical data on COVID-19 than on any other epidemic in history, but we have failed to analyze this information quickly enough to make a difference.
In this viewpoint, we expose this situation and suggest concrete ideas that health systems could implement to dynamically analyze their routine clinical data, becoming learning health systems and reversing the current situation.
Published: 2021-05-05
Journal: JMIRx Med
DOI: 10.2196/20617
DOI_URL: http://doi.org/10.2196/20617
Author Name: Prieto Merino David
Author link: https://covid19-data.nist.gov/pid/rest/local/author/prieto_merino_david
Author Name: Bebiano Da Providencia E Costa Rui
Author link: https://covid19-data.nist.gov/pid/rest/local/author/bebiano_da_providencia_e_costa_rui
Author Name: Bacallao Gallestey Jorge
Author link: https://covid19-data.nist.gov/pid/rest/local/author/bacallao_gallestey_jorge
Author Name: Sofat Reecha
Author link: https://covid19-data.nist.gov/pid/rest/local/author/sofat_reecha
Author Name: Chung Sheng Chia
Author link: https://covid19-data.nist.gov/pid/rest/local/author/chung_sheng_chia
Author Name: Potts Henry
Author link: https://covid19-data.nist.gov/pid/rest/local/author/potts_henry
sha: 1c6a6e81046581ecb05682c5e82c860e7bed5dc5
license: cc-by
license_url: https://creativecommons.org/licenses/by/4.0/
source_x: Medline; PMC
source_x_url: https://www.medline.com/; https://www.ncbi.nlm.nih.gov/pubmed/
pubmed_id: 34042100
pubmed_id_url: https://www.ncbi.nlm.nih.gov/pubmed/34042100
pmcid: PMC8104306
pmcid_url: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8104306
url: https://www.ncbi.nlm.nih.gov/pubmed/34042100/ https://doi.org/10.2196/20617
has_full_text: TRUE
Keywords Extracted from Text Content: extract COVID-19 patients COVID-19 extract OpenSAFELY people brain simpler-to-collect patients ... COVID-19 National Health Service NHS patient SARS-CoV-2 question(s http://med.jmirx.org/
Extracted Text Content in Record: First 5000 Characters:With over 117 million COVID-19-positive cases declared and the death count approaching 3 million, we would expect that the highly digitalized health systems of high-income countries would have collected, processed, and analyzed large quantities of clinical data from patients with COVID-19. Those data should have served to answer important clinical questions such as: what are the risk factors for becoming infected? What are good clinical variables to predict prognosis? What kinds of patients are more likely to survive mechanical ventilation? Are there clinical subphenotypes of the disease? All these, and many more, are crucial questions to improve our clinical strategies against the epidemic and save as many lives as possible. One might assume that in the era of big data and machine learning, there would be an army of scientists crunching petabytes of clinical data to answer these questions. However, nothing could be further from the truth. Our health systems have proven to be completely unprepared to generate, in a timely manner, a flow of clinical data that could feed these analyses. Despite gigabytes of data being generated every day, the vast quantity is locked in secure hospital data servers and is not being made available for analysis. Routinely collected clinical data are, by and large, regarded as a tool to inform decisions about individual patients, and not as a key resource to answer clinical questions through statistical analysis. The initiatives to extract COVID-19 clinical data are often promoted by private groups of individuals and not by health systems, and are uncoordinated and inefficient. The consequence is that we have more clinical data on COVID-19 than on any other epidemic in history, but we have failed to analyze this information quickly enough to make a difference. 
In this viewpoint, we expose this situation and suggest concrete ideas that health systems could implement to dynamically analyze their routine clinical data, becoming learning health systems and reversing the current situation. Many countries reacted late to the spread of the COVID-19 pandemic, although once they realized the seriousness of the situation, they took strong measures. The best-known measures relate to restrictions on population movement; other important implementations include increasing the capacity of health systems and the mobilization of the military to aid in this health emergency. Using a martial simile, it appears that governments have prepared their "health" armies and their populations for the war against the virus. An additional necessity is a good intelligence service to fight the war. This requires a system to collect data on the enemy and a group of analysts who can extract relevant information. Most current information systems pertaining to the pandemic focus on counting numbers of individuals tested, infected, hospitalized with serious conditions, recovered, and deceased. Data have also been collected to understand public behaviors [1], some of which was planned for [2]. These kinds of data can serve to estimate epidemiological curves and predict how the pandemic might evolve (if everything continues as it has been so far), but they provide very limited insight into how frontline doctors can fight the virus. Epidemiological data often combine a limited number of variables for substratification of cases; these data categorize continuous variables for reporting purposes and often lack detailed information on variables collected during hospital care. Epidemiological curves do not allow us to answer clinical questions such as what the most relevant risk factors are for becoming infected, having symptoms, becoming seriously ill, or dying.
They also do not allow us to study which treatments work better and what patient characteristics can influence the success or failure of the treatments. These are the questions that we need to answer to improve patient care and to rationalize the use of resources when health systems are at the limit of their capacities. These questions have not been answered satisfactorily. On March 24, 2020, a senior intensive care unit physician working in a big hospital in Madrid told the lead author that, "We learn things about the disease as we go along every day." Two weeks later, on April 9, 2020, a colleague working at a hospital affiliated with University College London Hospitals said, "...is a new disease with a pathology and clinical course that none of us know about." In between these two statements, thousands of patients have died or recovered from SARS-CoV-2 infection in Spain, the United Kingdom, and many other countries. It seems we hardly learned anything from those patients since months later we continue to ask the same clinical questions: what are the determinants for bad prognosis? How do we best treat patients? To answer clinical questions, we need clinical data from individual patients. We need a database where the anonymous clinical information of hospitali
Keywords Extracted from PMC Text: patients people SARS-CoV-2 extract brain COVID-19 " NHS question(s simpler-to-collect National Health Service OpenSAFELY patient's 's ... patient
Extracted PMC Text Content in Record: First 5000 Characters:Many countries reacted late to the spread of the COVID-19 pandemic, although once they realized the seriousness of the situation, they took strong measures. The best-known measures relate to restrictions on population movement; other important implementations include increasing the capacity of health systems and the mobilization of the military to aid in this health emergency. Using a martial simile, it appears that governments have prepared their "health" armies and their populations for the war against the virus. An additional necessity is a good intelligence service to fight the war. This requires a system to collect data on the enemy and a group of analysts who can extract relevant information. Most current information systems pertaining to the pandemic focus on counting numbers of individuals tested, infected, hospitalized with serious conditions, recovered, and deceased. Data have also been collected to understand public behaviors [1], some of which was planned for [2]. These kinds of data can serve to estimate epidemiological curves and predict how the pandemic might evolve (if everything continues as it has been so far), but they provide very limited insight into how frontline doctors can fight the virus. Epidemiological data often combine a limited number of variables for substratification of cases; these data categorize continuous variables for reporting purposes and often lack detailed information on variables collected during hospital care. Epidemiological curves do not allow us to answer clinical questions such as what the most relevant risk factors are for becoming infected, having symptoms, becoming seriously ill, or dying. They also do not allow us to study which treatments work better and what patient characteristics can influence the success or failure of the treatments. 
These are the questions that we need to answer to improve patient care and to rationalize the use of resources when health systems are at the limit of their capacities. These questions have not been answered satisfactorily. On March 24, 2020, a senior intensive care unit physician working in a big hospital in Madrid told the lead author that, "We learn things about the disease as we go along every day." Two weeks later, on April 9, 2020, a colleague working at a hospital affiliated with University College London Hospitals said, "...is a new disease with a pathology and clinical course that none of us know about." In between these two statements, thousands of patients have died or recovered from SARS-CoV-2 infection in Spain, the United Kingdom, and many other countries. It seems we hardly learned anything from those patients since months later we continue to ask the same clinical questions: what are the determinants for bad prognosis? How do we best treat patients? To answer clinical questions, we need clinical data from individual patients. We need a database where the anonymous clinical information of hospitalized patients with COVID-19 can be stored, curated, and made accessible to researchers. The structure of the data set does not have to be very complex since, for each patient, the disease involves basically a single hospital episode that does not usually last more than 4-5 weeks. The database would be continually fed with each hospital discharge. Statistical models to answer each clinical question can be programmed and automatically updated as more data come in. In this way, we would have a continuous information system growing with the epidemic and generating knowledge in real time that could be fed back to the frontline doctors treating patients. Therefore, the health system to fight the epidemic would have two subsystems working together: a care subsystem to treat patients and a knowledge subsystem to learn about the disease. 
This is what the literature has described as a learning health system [3,4]. Although it would be ideal to link up with the patient's medical history in specialized and primary care, this may not be necessary to answer the most pressing clinical questions about prognosis and the best therapeutic strategies. Initially, each health system (country/nation/region) would implement its own database including as many hospitals as possible although sharing information between countries would be advantageous. In addition, because the epidemic develops asynchronously in different countries, what we can learn from the data in one country can help to improve patient treatment in other countries. There are many private initiatives to create registries of patients with COVID-19 that include clinical data, some supported by professional societies [5-14]. Although understandable and praiseworthy, these initiatives are generally burdened with problems: Some governments and health institutions have implemented important initiatives to share individual patient clinical data on COVID-19. For example, the Mexican government, following a policy of open data, has been sharing clinical data on all COVID-19 cases si
PDF JSON Files: document_parses/pdf_json/1c6a6e81046581ecb05682c5e82c860e7bed5dc5.json
PMC JSON Files: document_parses/pmc_json/PMC8104306.xml.json
G_ID: why_we_are_losing_the_war_against_covid_19_on_the_data_front_and_how_to_reverse_the