Contribution of multiblock methods for predicting the severity of COVID-19 from clinical, biochemical and metabolomic data
Abstract
During the first wave of the COVID-19 pandemic, elderly men with metabolic diseases were found to be at higher risk of severe forms of the disease. To improve the early detection of patients who will require intensive care, we included 62 patients in our study. Phenotypic data (age, sex, obesity) and blood biochemistry measurements were collected within the first 2 days of hospitalisation. In addition, a multiplatform approach, consisting of untargeted metabolomics (HILIC and C18) and complementary targeted lipidomics LC-MS/MS analyses, was carried out. At the end of their follow-up, the patients were classified as “moderate” or “severe”, according to the severity of the disease they had developed. Exploratory (PCA, consensus-PCA) and supervised (PLS-DA, multiblock PLS-DA (MB-PLS-DA), and sequential and orthogonalised PLS-DA (SO-PLS-DA)) data analyses were then performed with the R package rchemo, to predict COVID-19 severity and to identify early, predictive biomarkers of progression towards severe forms. PCA on the separate blocks did not reveal, on the first axes, any variability related to COVID-19 severity. PLS-DA models were valid (based on permutation tests) only for the biochemistry and HILIC data, with cross-validated error rates (5-fold, repeated 30 times) of 26.56±0.92% and 31.08±2.54%, respectively. Consensus-PCA on the 7 datasets revealed only a very subtle effect related to COVID-19 severity. In contrast, the MB-PLS-DA and SO-PLS-DA models were valid, with cross-validated error rates of 25.32±2.40% and 23.33±3.08%, respectively. The most important variables in the MB-PLS-DA model were age, sex and markers of inflammatory response from the different biochemistry and metabolomic datasets. The SO-PLS-DA model used only one latent variable from the biochemistry block and one from a lipidomics dataset. The significant variables were inflammatory markers and metabolites associated with altered metabolic status.
In conclusion, MB-PLS-DA may be an interesting first step to highlight and explain slight effects in the datasets related to the outcome. SO-PLS-DA may be used to deepen the analysis and bring out complementary information. It may also be used to select blocks of data, an advantage for biomarker validation purposes.
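The cross-validated error rates quoted above come from 5-fold cross-validation repeated 30 times. As an illustration only, the following minimal Python sketch shows how such a figure could be computed for a two-class PLS-DA with a single latent variable. It is not the authors' rchemo workflow: the data are synthetic, the folds are stratified by class, and the ± value is taken as the standard deviation across repetitions, all of which are assumptions. The function names (`fit_pls1_one_lv`, `repeated_cv_error`, `make_synthetic`) are hypothetical.

```python
import random
import statistics

def fit_pls1_one_lv(X, y):
    """Fit a one-latent-variable PLS1 model on mean-centred data (y coded 0/1)."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Weight vector: direction of maximal covariance between X and y.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = sum(v * v for v in w) ** 0.5 or 1.0
    w = [v / norm for v in w]
    # Scores on the latent variable, then regression of y on the scores.
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    q = sum(t[i] * yc[i] for i in range(n)) / (sum(v * v for v in t) or 1.0)
    return x_mean, y_mean, w, q

def predict(model, X):
    """Assign class 1 when the predicted dummy response exceeds 0.5."""
    x_mean, y_mean, w, q = model
    preds = []
    for row in X:
        score = sum((row[j] - x_mean[j]) * w[j] for j in range(len(w)))
        preds.append(1 if y_mean + q * score > 0.5 else 0)
    return preds

def stratified_folds(y, k, rng):
    """Split sample indices into k folds, keeping class proportions similar."""
    folds = [[] for _ in range(k)]
    for label in sorted(set(y)):
        idx = [i for i, v in enumerate(y) if v == label]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds

def repeated_cv_error(X, y, k=5, repeats=30, seed=0):
    """Return mean and SD (in %) of the k-fold error rate over the repetitions."""
    rng = random.Random(seed)
    rates = []
    for _ in range(repeats):
        errors = 0
        for test_idx in stratified_folds(y, k, rng):
            held_out = set(test_idx)
            train_idx = [i for i in range(len(y)) if i not in held_out]
            model = fit_pls1_one_lv([X[i] for i in train_idx],
                                    [y[i] for i in train_idx])
            preds = predict(model, [X[i] for i in test_idx])
            errors += sum(p != y[i] for p, i in zip(preds, test_idx))
        rates.append(100.0 * errors / len(y))
    return statistics.mean(rates), statistics.stdev(rates), rates

def make_synthetic(n=62, p=6, seed=1):
    """Synthetic stand-in data: only the first variable carries class signal."""
    rng = random.Random(seed)
    y = [i % 2 for i in range(n)]
    X = [[rng.gauss(1.0 * label if j == 0 else 0.0, 1.0) for j in range(p)]
         for label in y]
    return X, y

X, y = make_synthetic()
mean_err, sd_err, rates = repeated_cv_error(X, y)
print(f"CV error: {mean_err:.2f} +/- {sd_err:.2f} %")
```

Each repetition reshuffles the fold assignment, so the spread across the 30 repetitions reflects how sensitive the error estimate is to the particular split, which matters with a cohort as small as 62 patients.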