Bibliographic record no.: 982630
The use of square of correlation coefficient (q2) for estimating the quality of models in chemistry: A 30 years old question
The use of square of correlation coefficient (q2) for estimating the quality of models in chemistry: A 30 years old question // Math/Chem/Comp 2018, 30th MC2 Conference : Book of abstract / Vančik, Hrvoje ; Cioslowski, Jerzy (eds.).
Zagreb, 2018, p. 9 (lecture, international peer review, abstract, scientific)
CROSBI ID: 982630
Title
The use of square of correlation coefficient (q2)
for estimating the quality of models in chemistry:
A 30 years old question
Authors
Lučić, Bono
Type, subtype and category of work
Conference abstracts, abstract, scientific
Source
Math/Chem/Comp 2018, 30th MC2 Conference : Book of abstract
/ Vančik, Hrvoje ; Cioslowski, Jerzy - Zagreb, 2018, 9-9
Conference
30th Math/Chem/Comp Conference
Place and date
Dubrovnik, Croatia, 18.06.2018 - 23.06.2018
Type of participation
Lecture
Type of review
International peer review
Keywords
model quality ; structure-property modeling
Abstract
The correlation coefficient (or its square) is the most commonly used statistical parameter for estimating the quality of models in chemistry. It is usually calculated in each of the three main validation procedures: fitting (r2), cross-validation (q2), and prediction on an external test set (2). The wide use of q2 in cross-validation was initiated by a very frequently cited paper by Cramer et al. [1], which introduced comparative molecular field analysis, based on the partial least squares method, into chemical modeling. The use of q2 was further accelerated by its inclusion in the regulatory perspectives of the U.S. Environmental Protection Agency (EPA) [2] and in the OECD guidance document on the validation of (quantitative) structure-activity relationship ((Q)SAR) models [3]. In later applications of q2, several researchers noticed its strange properties, such as overestimation or underestimation when applied to an external (test) data set. Two alternative variants of q2 [4, 5] were therefore proposed, but the problem remains unsolved. All these results are reviewed and their statistical foundations are re-analysed, considering all three validation procedures, i.e. fitting, cross-validation and prediction. The results will be illustrated on literature data sets. It turns out that the best estimate of model quality is obtained by simultaneously calculating and comparing the root-mean-square errors of fit, cross-validation and prediction.
[1] R. D. Cramer et al., Comparative molecular field analysis (CoMFA). 1. Effect of shape on binding of steroids to carrier proteins. J. Am. Chem. Soc. 110 (1988) 5959–5967.
[2] M. Zeeman et al., U.S. EPA regulatory perspectives on the use of QSAR for new and existing chemical evaluations. SAR QSAR Environ. Res. 3 (1995) 179–201.
[3] OECD guidelines concerning QSARs, 2007, pp. 55–65. http://www.oecd.org/officialdocuments/publicdisplaydocumentpdf/?doclanguage=en&cote=env/jm/mono(2007)2, accessed: May 25, 2018.
[4] G. Schüürmann et al., External validation and prediction employing the predictive squared correlation coefficient - test set activity mean vs training set activity mean. J. Chem. Inf. Model. 48 (2008) 2140–2145.
[5] V. Consonni et al., Comments on the definition of the q2 parameter for QSAR validation. J. Chem. Inf. Model. 49 (2009) 1669–1678.
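The comparison recommended in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch, not the author's procedure. It assumes a one-descriptor least-squares model, leave-one-out cross-validation, and made-up toy data. The three external q2 formulas are the forms commonly attributed in the QSAR literature to the training-set-mean definition and to the variants discussed in [4] and [5]; treat them as assumptions and check them against the cited papers. All function and key names are hypothetical.

```python
from math import sqrt

def fit_line(x, y):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b

def predict(model, x):
    a, b = model
    return [a + b * xi for xi in x]

def sum_sq(y, y_hat):
    """Sum of squared differences between observed and estimated values."""
    return sum((yi - hi) ** 2 for yi, hi in zip(y, y_hat))

def rmse(y, y_hat):
    return sqrt(sum_sq(y, y_hat) / len(y))

def loo_predictions(x, y):
    """Leave-one-out: refit without sample i, then predict sample i."""
    preds = []
    for i in range(len(x)):
        model = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        preds.append(predict(model, [x[i]])[0])
    return preds

def validation_report(x_train, y_train, x_test, y_test):
    model = fit_line(x_train, y_train)
    fit = predict(model, x_train)          # fitting
    cv = loo_predictions(x_train, y_train) # cross-validation
    ext = predict(model, x_test)           # external prediction

    mean_train = sum(y_train) / len(y_train)
    mean_test = sum(y_test) / len(y_test)
    tss_train = sum((yi - mean_train) ** 2 for yi in y_train)
    press_ext = sum_sq(y_test, ext)

    return {
        "r2": 1 - sum_sq(y_train, fit) / tss_train,
        "q2_loo": 1 - sum_sq(y_train, cv) / tss_train,
        # Assumed form: test-set deviations taken from the TRAINING mean.
        "q2_ext_train_mean": 1 - press_ext
            / sum((yi - mean_train) ** 2 for yi in y_test),
        # Assumed form of the variant in [4]: deviations from the TEST mean.
        "q2_ext_test_mean": 1 - press_ext
            / sum((yi - mean_test) ** 2 for yi in y_test),
        # Assumed form of the variant in [5]: per-sample scaling of both sums.
        "q2_ext_scaled": 1 - (press_ext / len(y_test))
            / (tss_train / len(y_train)),
        "rmse_fit": rmse(y_train, fit),
        "rmse_cv": rmse(y_train, cv),
        "rmse_pred": rmse(y_test, ext),
    }

# Toy data, invented purely for illustration.
x_train = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y_train = [1.1, 1.9, 3.2, 3.9, 5.1, 5.8]
x_test = [7.0, 8.0, 9.0]
y_test = [7.2, 7.9, 9.1]

report = validation_report(x_train, y_train, x_test, y_test)
for name, value in report.items():
    print(f"{name:>18s} = {value:.4f}")
```

On the same external predictions the three external q2 values can differ considerably, because each uses a different reference sum of squares, while the three root-mean-square errors are expressed in the units of the modeled property and can be compared directly.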
Original language
English
Scientific fields
Chemistry