Machine learning methods for cross-column prediction of retention time in reversed-phased liquid chromatography (CROSBI ID 716276)
Conference contribution in proceedings | abstract of conference presentation | international peer review
Responsibility details
Lovrić, Mario ; Žuvela, Petar ; Lučić, Bono ; Liu, Jay J. ; Kern, Roman ; Bączek, Tomasz
English
Machine learning methods for cross-column prediction of retention time in reversed-phased liquid chromatography
Quantitative structure-retention relationships (QSRR) were employed to build global models for predicting the chromatographic retention time of synthetic peptides across six RP-LC-MS/MS columns and varied experimental conditions. The global QSRR models were based on only three a priori selected molecular descriptors related to the retention mechanism of RP-LC separation of peptides: the sum of gradient retention times of the 20 natural amino acids (logSumAA), van der Waals volume (logvdWvol.), and hydrophobicity (clogP). Three machine learning regression methods were compared: random forests (RF), partial least squares (PLS), and adaptive boosting (ADA). All models were optimized through 3-fold cross-validation (CV) and validated on an external validation set. The chemical domain of applicability was also defined. The percentage root mean square error of prediction (%RMSEP) was used as the external validation metric. RF exhibited a %RMSEP of 14.99 %, PLS a %RMSEP of 40.561 %, and ADA a %RMSEP of 26.35 %. The ensemble models (RF and ADA) thus considerably outperformed the conventional PLS-based QSRR model. Novel methods of tree-based model explainability were employed to reveal the mechanisms behind the black-box global ensemble QSRR models. The models revealed the highest feature importance for the sum of gradient retention times (logSumAA), followed by van der Waals volume (logvdWvol.) and hydrophobicity (clogP). The promising results of this study show the potential of machine learning for improved peptide identification, retention time standardization, and integration into state-of-the-art LC-MS/MS proteomics workflows.
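The validation scheme described in the abstract (3-fold cross-validation followed by %RMSEP on an external set) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not give the %RMSEP formula, so the normalization by the mean observed retention time used here is an assumption, as are the function names percent_rmsep and kfold_indices and the example retention times.

```python
import math


def percent_rmsep(observed, predicted):
    """Root mean square error of prediction, as a percentage.

    Assumption: the error is normalized by the mean observed value;
    the abstract does not state which normalization was used.
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    mean_observed = sum(observed) / len(observed)
    return 100.0 * math.sqrt(mse) / mean_observed


def kfold_indices(n_samples, k=3):
    """Return k (train, test) index splits; k=3 mirrors the 3-fold CV.

    Samples are assigned to folds in an interleaved, deterministic way
    (no shuffling), which is a simplification for illustration.
    """
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    splits = []
    for i in range(k):
        test = folds[i]
        train = sorted(j for m, fold in enumerate(folds) if m != i for j in fold)
        splits.append((train, test))
    return splits


# Hypothetical retention times in minutes, not data from the study.
observed = [10.0, 20.0, 30.0]
predicted = [12.0, 18.0, 33.0]
print(round(percent_rmsep(observed, predicted), 2))

for train, test in kfold_indices(6, k=3):
    print(train, test)
```

In practice, each candidate model would be fitted on the training indices of every split during hyperparameter tuning, and percent_rmsep would then be computed once on the held-out external set.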
machine learning ; proteomics ; QSAR ; HPLC
Contribution details
72-72.
2019.
published
Parent publication details
8th IAPC Meeting Book of Abstracts
Mandić, Zoran
Split: International Association of Physical Chemists (IAPC)
Conference details
8th IAPC Meeting: Eighth World Conference on Physico-Chemical Methods in Drug Discovery & Fifth World Conference on ADMET and DMPK
poster
09.09.2019-11.09.2019
Split, Croatia
Related research fields
Interdisciplinary natural sciences, Interdisciplinary technical sciences, Chemistry