accurately differentiating covid 19 other viral infection and healthy individuals CORD-Papers-2021-10-25 (Version 1)

Title: Accurately Differentiating COVID-19, Other Viral Infection, and Healthy Individuals Using Multimodal Features via Late Fusion Learning
Abstract: Effectively identifying COVID-19 patients using non-PCR clinical data is critical for optimal clinical outcomes. Currently, there is a lack of comprehensive understanding of the various biomedical features and of appropriate technical approaches for accurately detecting COVID-19 patients. In this study, we recruited 214 confirmed COVID-19 patients of non-severe (NS) clinical type and 148 of severe (S) clinical type, 198 non-infected healthy (H) participants, and 129 non-COVID viral pneumonia (V) patients. The participants' clinical information (23 features), lab testing results (10 features), and thoracic CT scans upon admission were acquired as three input feature modalities. To enable late fusion of the multimodal data, we developed a deep learning model to extract a 10-feature high-level representation of the CT scans. Exploratory analyses showed substantial differences in all features among the four classes. Three machine learning models (k-nearest neighbor, kNN; random forest, RF; and support vector machine, SVM) were developed based on the 43 features combined from all three modalities to differentiate the four classes (NS, S, V, and H) at once. All three models had high accuracy in differentiating the four classes overall (95.4%-97.7%) and each individual class (90.6%-99.9%). Multimodal features provided a substantial performance gain over using any single feature modality. Compared to existing binary classification benchmarks, which often focus on a single feature modality, this study provided a novel and effective breakthrough for clinical applications. The findings and the analytical workflow can be used as clinical decision support for current COVID-19 and other clinical applications with high-dimensional multimodal biomedical features.
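The late fusion step described in the abstract (23 clinical + 10 lab + 10 CT-derived features concatenated into one 43-feature vector per participant, then passed to a classifier) can be illustrated with a minimal sketch. This is not the authors' code: the data below are synthetic Gaussian stand-ins, the helper names (`fuse`, `make_synthetic_cohort`, `knn_predict`, `holdout_accuracy`) and parameters such as `k=5` and `class_gap` are hypothetical, and a small pure-Python kNN with Euclidean distance stands in for the kNN, RF, and SVM models the study reports. Only the class sizes and feature counts are taken from the abstract.

```python
import math
import random
from collections import Counter

# Class sizes and per-modality feature counts as stated in the abstract.
CLASS_SIZES = {"NS": 214, "S": 148, "H": 198, "V": 129}
MODALITY_DIMS = {"clinical": 23, "lab": 10, "ct": 10}


def fuse(clinical, lab, ct):
    """Late fusion by concatenation: one row vector per participant."""
    return list(clinical) + list(lab) + list(ct)


def make_synthetic_cohort(seed=0, class_gap=2.0):
    """Hypothetical stand-in data: Gaussian features with class-dependent means."""
    rng = random.Random(seed)
    rows, labels = [], []
    for class_index, (label, size) in enumerate(CLASS_SIZES.items()):
        center = class_index * class_gap
        for _ in range(size):
            # One feature vector per modality, in the order clinical, lab, ct.
            parts = [
                [rng.gauss(center, 1.0) for _ in range(dim)]
                for dim in MODALITY_DIMS.values()
            ]
            rows.append(fuse(*parts))
            labels.append(label)
    return rows, labels


def knn_predict(train_rows, train_labels, query, k=5):
    """Majority vote among the k training rows closest in Euclidean distance."""
    nearest = sorted(
        zip(train_rows, train_labels),
        key=lambda pair: math.dist(pair[0], query),
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def holdout_accuracy(rows, labels, test_fraction=0.2, seed=0):
    """Shuffle, hold out a fraction of participants, and score kNN on them."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    n_test = int(len(indices) * test_fraction)
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    train_rows = [rows[i] for i in train_idx]
    train_labels = [labels[i] for i in train_idx]
    correct = sum(
        knn_predict(train_rows, train_labels, rows[i]) == labels[i]
        for i in test_idx
    )
    return correct / n_test
```

Calling `make_synthetic_cohort()` yields 689 fused rows of 43 features each, matching the cohort size and feature count in the record; `holdout_accuracy(rows, labels)` then gives a four-class accuracy on the synthetic data only, which says nothing about the accuracies reported in the study.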
Published: 8/21/2020
DOI: 10.1101/2020.08.18.20176776
DOI_URL: http://doi.org/10.1101/2020.08.18.20176776
Author Name: Xu, M
Author link: https://covid19-data.nist.gov/pid/rest/local/author/xu_m
Author Name: Ouyang, L
Author link: https://covid19-data.nist.gov/pid/rest/local/author/ouyang_l
Author Name: Gao, Y
Author link: https://covid19-data.nist.gov/pid/rest/local/author/gao_y
Author Name: Chen, Y
Author link: https://covid19-data.nist.gov/pid/rest/local/author/chen_y
Author Name: Yu, T
Author link: https://covid19-data.nist.gov/pid/rest/local/author/yu_t
Author Name: Li, Q
Author link: https://covid19-data.nist.gov/pid/rest/local/author/li_q
Author Name: Sun, K
Author link: https://covid19-data.nist.gov/pid/rest/local/author/sun_k
Author Name: Bao, F S
Author link: https://covid19-data.nist.gov/pid/rest/local/author/bao_f_s
Author Name: Safarnejad, L
Author link: https://covid19-data.nist.gov/pid/rest/local/author/safarnejad_l
Author Name: Wen, J
Author link: https://covid19-data.nist.gov/pid/rest/local/author/wen_j
Author Name: Jiang, C
Author link: https://covid19-data.nist.gov/pid/rest/local/author/jiang_c
Author Name: Chen, T
Author link: https://covid19-data.nist.gov/pid/rest/local/author/chen_t
Author Name: Han, L
Author link: https://covid19-data.nist.gov/pid/rest/local/author/han_l
Author Name: Zhang, H
Author link: https://covid19-data.nist.gov/pid/rest/local/author/zhang_h
Author Name: Yu, Z
Author link: https://covid19-data.nist.gov/pid/rest/local/author/yu_z
Author Name: Liu, X
Author link: https://covid19-data.nist.gov/pid/rest/local/author/liu_x
Author Name: Yan, T
Author link: https://covid19-data.nist.gov/pid/rest/local/author/yan_t
Author Name: Li, H
Author link: https://covid19-data.nist.gov/pid/rest/local/author/li_h
Author Name: Robinson, P
Author link: https://covid19-data.nist.gov/pid/rest/local/author/robinson_p
Author Name: Zhu, B
Author link: https://covid19-data.nist.gov/pid/rest/local/author/zhu_b
Author Name: Liu, J
Author link: https://covid19-data.nist.gov/pid/rest/local/author/liu_j
Author Name: Liu, Y
Author link: https://covid19-data.nist.gov/pid/rest/local/author/liu_y
Author Name: Zhang, Z
Author link: https://covid19-data.nist.gov/pid/rest/local/author/zhang_z
Author Name: Ge, Y
Author link: https://covid19-data.nist.gov/pid/rest/local/author/ge_y
Author Name: Chen, S
Author link: https://covid19-data.nist.gov/pid/rest/local/author/chen_s
sha: ae05274f4d697a91934241401db6606783cc75f8
license: medrxiv
source_x: MedRxiv; WHO
source_x_url: https://www.who.int/
url: https://doi.org/10.1101/2020.08.18.20176776 http://medrxiv.org/cgi/content/short/2020.08.18.20176776v1?rss=1
has_full_text: TRUE
Keywords Extracted from Text Content: thoracic CT COVID-19 patients COVID-19 patients extract H convolutional neural V participants hemoglobin C-reactive TBIL Wuhan Union bilirubin H class Kunshan People elerly kidney upper KS-tests L-2 blood samples muscle hs-CRP matrix HGB Patient participants CPD CREA neutrophil medRxiv V. neural networks Non-COVID viral creatine Technology participants NE Hubei Province https://github.com/forrestbao/corona/tree/master/ct VOM FC2 medRxiv preprint https://doi.org/10.1101/2020.08 CNN6 FiO2 S. human radiologist CAR HYP WBC hyperplane DIR CHL Patients ±0.2 human body ±0.5 Altmayer https://doi.org/10.1101/2020.08.18.20176776 doi COVID-19 (S) class blood cell V Class-specific throat patient SEs CNNs Balrtusaitis scikit-learn oxygen Hubei Provincial CDC's gamma=1/43 MUC LDH Fig. 6 CRP clinical/lab LY cardiovascular F1 kNN COVID-like Python Fig. 1 radiologist p<0.05 GGOs HIF Wuhan 50mA tube FTG lymphocyte k-nearest neighbor (kNN Arga LOF hemoglobin (HGB COVID-19 (S) Wuhan ( S3 individuals thoracic p=2 hyperparameter chest adenovirus Gini medRxiv preprint 70 blood DNNs Fig. 4 COVID Metlay neutrophils 1x[23+10+10]) row vector JSJK2020-8003-01 90.3% H Hubei 50yr COVID-19 patients COVID-19 Low-dimensional 300mmHg supplementary Fig. S2 Baltruschat CT SOR layer 2 Kunshan 86%-99 FC layer leaf node patients V and S supplementary Fig. S3 hsTNI, ddimer V patients Fig. 2 Fig. 3 SHB PLT results(Xl 60%-80 Daniells FC1 H and V 10-element SARS-CoV-2 layer radial human
Extracted Text Content in Record: First 5000 Characters: We trained and validated late fusion deep learning-machine learning models to predict non-severe COVID-19, severe COVID-19, non-COVID viral infection, and healthy classes from clinical, lab testing, and CT scan features extracted from a convolutional neural network, and achieved predictive accuracy of >96% in differentiating all four classes at once based on a large dataset of 689 participants. Abstract Effectively identifying COVID-19 patients using non-PCR clinical data is critical for optimal clinical outcomes. Currently, there is a lack of comprehensive understanding of the various biomedical features and of appropriate technical approaches for accurately detecting COVID-19 patients. In this study, we recruited 214 confirmed COVID-19 patients of non-severe (NS) clinical type and 148 of severe (S) clinical type, 198 non-infected healthy (H) participants, and 129 non-COVID viral pneumonia (V) patients. The participants' clinical information (23 features), lab testing results (10 features), and thoracic CT scans upon admission were acquired as three input feature modalities. To enable late fusion of the multimodal data, we developed a deep learning model to extract a 10-feature high-level representation of the CT scans. Exploratory analyses showed substantial differences in all features among the four classes. Three machine learning models (k-nearest neighbor, kNN; random forest, RF; and support vector machine, SVM) were developed based on the 43 features combined from all three modalities to differentiate the four classes (NS, S, V, and H) at once. All three models had high accuracy in differentiating the four classes overall (95.4%-97.7%) and each individual class (90.6%-99.9%). Multimodal features provided a substantial performance gain over using any single feature modality.
Compared to existing binary classification benchmarks, which often focus on a single feature modality, this study provided a novel and effective breakthrough for clinical applications. The findings and the analytical workflow can be used as clinical decision support for current COVID-19 and other clinical applications with high-dimensional multimodal biomedical features. Introduction hematological biochemistry change. Because of the challenge of asymptomatic infection of COVID-19, other types of biomedical information, such as lab testing results, can serve as alternative diagnostic decision support evidence. It is possible that our current definition and/or understanding of "asymptomatic infection" may be extended by more intrinsic, quantitative, and subtle biomedical evidence (Daniells et al. 2020, Gandhi et al. 2020). Despite tremendous advances in alternative and complementary diagnostic evidence for COVID-19, there are still substantial clinical knowledge gaps and technical challenges that hinder our efforts to harness the power of various biomedical data. First of all, most current studies focus on a single type (modality) of diagnostic data and do not consider the potential interactions and added interpretability among modalities. For example, can we leverage both CT scans and clinical information to develop a more accurate COVID-19 diagnostic decision support system? As stated earlier, the human body is a unity against SARS-CoV-2 infection. Biomedical imaging and clinical approaches evaluate different aspects of the clinical consequences of COVID-19. By combining these different modalities of biomedical information, we may be able to achieve a more comprehensive characterization of COVID-19. This is referred to as "multimodal biomedical information" research.
Secondly, while there are ample accurate deep learning (DL) algorithms, models, and tools, especially for biomedical imaging, most of them focus on differentiating COVID-19 from non-infected healthy individuals. A moderately trained radiologist can also differentiate CT scans of COVID-19 patients from those of healthy individuals with high accuracy, which limits the clinical usefulness of DL algorithms developed for this binary classification problem. The more critical and urgent clinical question is how to differentiate COVID-19 not only from non-infected healthy individuals but also from other non-COVID viral infections (Altmayer et al. 2020). Patients with non-COVID viral infection also present GGO in their CT scans. Therefore, the specificity of GGO as a diagnostic criterion for COVID-19 is low (Ai et al. 2020). In addition, non-severe COVID-19 and non-COVID viral infection patients share some easily confused symptoms (Qu et al. 2020). For frontline clinicians, effectively differentiating non-severe COVID-19 from non-COVID viral infection is therefore a challenging task without confirmatory molecular testing, which may not be readily available at the time of admission. Similarly, differentiating asymptomatic and pre-symptomatic (including those in non-severe clinical type) COVID-19 from non-infected healthy individuals is another major clinical challenge (Ooi and L
PDF JSON Files: document_parses/pdf_json/ae05274f4d697a91934241401db6606783cc75f8.json
G_ID: accurately_differentiating_covid_19_other_viral_infection_and_healthy_individuals
S2 ID: 221209094