Data source: CROSBI

Cross-column chromatographic retention time prediction in proteomics: a machine learning approach (CROSBI ID 706211)

Conference contribution in proceedings | extended abstract of a conference presentation

Žuvela, Petar ; Lovrić, Mario ; Lučić, Bono ; Liu, Jay ; Kern, Roman ; Baczek, Tomasz Cross-column chromatographic retention time prediction in proteomics: a machine learning approach // HPLC2019 Kyoto - 49th International Symposium on High Performance Liquid Phase Separations and Related Techniques. 2019. doi: 10.13140/RG.2.2.32898.02248

Responsibility details

Žuvela, Petar ; Lovrić, Mario ; Lučić, Bono ; Liu, Jay ; Kern, Roman ; Baczek, Tomasz

English

Cross-column chromatographic retention time prediction in proteomics: a machine learning approach

Quantitative structure-retention relationships (QSRR), although widely used to predict retention time in reversed-phase liquid chromatography (RP-LC), share a principal limitation: they are typically built for a specific set of chromatographic conditions (e.g., stationary phase, mobile phase composition, pH, temperature, total gradient time). To overcome this limitation, we aimed to build global QSRR models that predict the retention time of synthetic peptides across six RP-LC columns under varied experimental conditions. The QSRR models were based on three a priori selected molecular descriptors related to the retention mechanism of RP-LC separation of peptides: sum of gradient retention times of the 20 natural amino acids (logSumAA), van der Waals volume (logvdWvol.), and hydrophobicity (clogP). Several machine learning methods were compared: random forests (RF), adaptive boosting (ADA), and Gaussian process regression (GPR). The models were comprehensively optimized through 3-fold cross-validation (CV) and validated on an external validation set. The chemical domain of applicability was also defined, and the statistical significance of the models was tested using CV-ANOVA. All models were also compared with a conventional linear model built using partial least squares (PLS). All the machine learning methods outperformed PLS, with %RMSEP ranging from 14.99 % for RF to 26.35 % for ADA, whereas PLS exhibited a %RMSEP of 40.56 %. The novel ensemble and mixture models revealed mechanisms behind black-box global QSRR models and paved the way to resolving the principal limitation of QSRR modelling. The models showed the highest feature importance for the sum of gradient retention times (logSumAA), followed by van der Waals volume (logvdWvol.) and hydrophobicity (clogP).
The promising results of this study show the potential of machine learning for improved peptide identification, retention time standardization and integration into state-of-the-art LC-MS/MS proteomics workflows.
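
The abstract reports model error as %RMSEP but does not define it. The sketch below assumes one common definition: the root mean square error of prediction on the external validation set, expressed as a percentage of the mean observed retention time. The function name `percent_rmsep` and the retention times are hypothetical and serve only to illustrate the arithmetic; they are not taken from the study.

```python
import math


def percent_rmsep(observed, predicted):
    """Return the RMSEP as a percentage of the mean observed value.

    Assumption: the error is normalised by the mean observed retention
    time. The abstract does not state which normalisation was used.
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("observed and predicted must be non-empty and of equal length")
    # Mean squared error of prediction over the validation set.
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    mean_observed = sum(observed) / len(observed)
    return 100.0 * math.sqrt(mse) / mean_observed


# Hypothetical retention times in minutes for an external validation set.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]
print(round(percent_rmsep(observed, predicted), 2))
```

Under this assumed definition, a lower value means the predicted retention times lie closer to the measured ones relative to the typical retention time, which is the sense in which the abstract ranks RF ahead of ADA and PLS.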

chromatography ; machine learning ; QSRR

not recorded

not recorded

not recorded

not recorded

not recorded

not recorded

Contribution details

AB00002

2019.

published

10.13140/RG.2.2.32898.02248

Parent publication details

HPLC2019 Kyoto - 49th International Symposium on High Performance Liquid Phase Separations and Related Techniques

Conference details

HPLC2019 Kyoto - 49th International Symposium on High Performance Liquid Phase Separations and Related Techniques

poster

01.12.2019-05.12.2019

Kyoto, Japan

Related research fields

Interdisciplinary natural sciences, Interdisciplinary technical sciences, Chemistry, Computer science
