Contribution of multiblock methods for predicting the severity of COVID-19 from clinical, biochemical and metabolomic data

During the first wave of the COVID-19 pandemic, elderly men with metabolic diseases were found to be at higher risk of severe forms of the disease. To improve the early detection of patients who will require intensive care, 62 patients were included in our study. Phenotypic data (age, sex, obesity) and blood biochemistry measurements were collected within the first 2 days of hospitalisation. In addition, a multiplatform approach, consisting of untargeted metabolomics (HILIC and C18) and complementary targeted lipidomics LC-MS/MS analyses, was carried out. At the end of their follow-up, the patients were classified as “moderate” or “severe” according to the severity of the COVID they developed. Exploratory analyses (PCA, consensus-PCA) and supervised analyses predicting COVID severity (PLS-DA, multiblock PLS-DA (MB-PLS-DA), and sequential and orthogonalised PLS-DA (SO-PLS-DA)) were then performed with the R package rchemo to identify early, predictive biomarkers of progression towards severe forms of COVID. PCA on the separate blocks failed to reveal, on the first axes, any variability related to COVID severity. PLS-DA models were valid (based on permutation tests) only for the biochemistry and HILIC data, with cross-validated error rates (5-fold, repeated 30 times) of 26.56±0.92% and 31.08±2.54%, respectively. Consensus-PCA on the 7 datasets revealed only a very subtle effect related to COVID severity. In contrast, the MB-PLS-DA and SO-PLS-DA models were valid, with cross-validated error rates of 25.32±2.40% and 23.33±3.08%, respectively. The most important variables in the MB-PLS-DA model were age, sex and markers of inflammatory response from the different biochemistry and metabolomic datasets. The SO-PLS-DA model used only one latent variable from the biochemistry block and one from a lipidomics dataset. The significant variables were inflammatory markers and metabolites associated with altered metabolic status.
In conclusion, MB-PLS-DA may be an interesting first step to highlight and explain slight effects in the datasets related to the outcome. SO-PLS-DA may be used to deepen the analysis and bring out complementary information. It may also be used to select blocks of data, an advantage for biomarker validation purposes.
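
The validation scheme mentioned in the abstract (5-fold cross-validation repeated 30 times, reported as mean ± standard deviation of the error rate, plus a permutation test) can be illustrated with a minimal Python sketch. This is not the authors' rchemo pipeline: the classifier is a nearest-centroid stand-in rather than PLS-DA, the data are synthetic, the permutation statistic (cross-validated error rate) is an assumption since the abstract does not specify it, and all function names are hypothetical.

```python
import random
import statistics

def stratified_folds(labels, k, rng):
    """Split sample indices into k folds with similar class proportions."""
    folds = [[] for _ in range(k)]
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    pos = 0
    for idx in by_class.values():
        rng.shuffle(idx)
        for i in idx:  # deal indices round-robin so fold sizes stay balanced
            folds[pos % k].append(i)
            pos += 1
    return folds

def nearest_centroid_predict(X_train, y_train, X_test):
    """Stand-in classifier (NOT PLS-DA): pick the class with the closest mean."""
    sums, counts = {}, {}
    for x, y in zip(X_train, y_train):
        s = sums.setdefault(y, [0.0] * len(x))
        for j, v in enumerate(x):
            s[j] += v
        counts[y] = counts.get(y, 0) + 1
    centroids = {y: [v / counts[y] for v in s] for y, s in sums.items()}

    def dist2(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    return [min(centroids, key=lambda c: dist2(x, centroids[c])) for x in X_test]

def repeated_cv_error(X, y, k=5, repeats=30, seed=0):
    """Return (mean, sd) of the error rate in % over `repeats` k-fold runs."""
    rng = random.Random(seed)
    rates = []
    for _ in range(repeats):
        errors = 0
        for fold in stratified_folds(y, k, rng):
            held_out = set(fold)
            train = [i for i in range(len(y)) if i not in held_out]
            pred = nearest_centroid_predict(
                [X[i] for i in train], [y[i] for i in train], [X[i] for i in fold]
            )
            errors += sum(p != y[i] for p, i in zip(pred, fold))
        rates.append(100.0 * errors / len(y))
    return statistics.mean(rates), statistics.stdev(rates)

def permutation_p_value(X, y, n_perm=50, seed=1, **cv):
    """Share of label permutations whose CV error is <= the observed one."""
    rng = random.Random(seed)
    observed, _ = repeated_cv_error(X, y, **cv)
    hits = 0
    for _ in range(n_perm):
        y_perm = y[:]
        rng.shuffle(y_perm)  # break any link between features and labels
        perm_err, _ = repeated_cv_error(X, y_perm, **cv)
        hits += perm_err <= observed
    return (hits + 1) / (n_perm + 1)

# Synthetic data only (assumption): 62 samples, 4 features, shifted class means.
data_rng = random.Random(42)
y = ["moderate"] * 31 + ["severe"] * 31
X = [[data_rng.gauss(1.5 if label == "severe" else 0.0, 1.0) for _ in range(4)]
     for label in y]

mean_err, sd_err = repeated_cv_error(X, y)  # 5-fold, repeated 30 times
p_value = permutation_p_value(X, y, n_perm=50, repeats=3)
print(f"CV error: {mean_err:.2f} ± {sd_err:.2f} %")
print(f"permutation p-value: {p_value:.3f}")
```

In this sketch, the standard deviation is taken across the 30 repeats, which is one plausible reading of the ± values quoted above; a model would be called "valid" when the permutation p-value falls below a chosen threshold.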

Main file

CCM2025_BoA_Brandolini.pdf (185.4 KB)

Origin	Files produced by the author(s)
Licence	HAL authorisation

https://hal.inrae.fr/hal-05265568

Submitted on: Wednesday 17 September 2025, 16:37:36

Last modified on: Wednesday 29 October 2025, 13:38:06

Long-term archiving on: Thursday 18 December 2025, 19:25:23

Dates and versions

hal-05265568 , version 1 (17-09-2025)

Identifiers

HAL Id : hal-05265568 , version 1

Cite

Marion Brandolini-Bunlon, Benoit B. Jaillais, Audrey Le Gouellec, Jean-Michel J. -M. Roger, Estelle Pujos-Guillot. Contribution of multiblock methods for predicting the severity of COVID-19 from clinical, biochemical and metabolomic data. Colloquium Chemiometricum Mediterraneum (CCM2025), Société Française de Statistiques - groupe Chimiométrie, Sep 2025, Porquerolles, France. ⟨hal-05265568⟩
