Comparison of Variable Selection Methods. Prediction of Toxicity of Substituted Benzenes (CROSBI ID 549880)
Conference contribution in proceedings | abstract of conference presentation | international peer review
Responsibility details
Heberger, Károly ; Nikolić, Sonja
English
Comparison of Variable Selection Methods. Prediction of Toxicity of Substituted Benzenes
Variable selection is the key step in modeling and in the search for quantitative structure-activity relationships. Initially, cross-validation, and especially the leave-one-out (LOO) procedure, was accepted as a suitable method for validation. Later, the scientific community agreed on using external validation, i.e. testing the model performance on an independent prediction set [1, 2]. Independent here means that this test should not be connected to variable selection, model building, parameter estimation, determining latent variables and similar steps. However, recent and extensive examinations have shown that LOO is much better than previously thought [3]. Hence, our aim was to study and compare various modeling methods, and especially variable selection methods. A well-known data set was chosen for the comparison [4, 5]. Our results indicate that (i) the split into three sets is too conservative; (ii) leave-one-out is no more biased than external validation; (iii) PLS (without any variable selection) successfully competes with the best variable selection methods; (iv) the ordering of variable selection methods is CromRSel > FS > mBSS ~ PLS …; (v) it is worth filtering out the outliers (even from the test set); (vi) PCM and Marten’s test provide comparable results; (vii) consensus modeling of the four best methods outperforms the best individual model on the entire data set.
1. P. Gramatica, Principles of QSAR models validation: internal and external, QSAR Comb. Sci. 26 (2007) 694–701.
2. A. Tropsha, P. Gramatica, V. K. Gombar, The importance of being earnest: Validation is the absolute essential for successful application and interpretation of QSPR models, QSAR Comb. Sci. 22 (2003) 69–77.
3. D. M. Hawkins, Assessing model fit by cross validation, J. Chem. Inf. Comput. Sci. 43 (2003) 579–586.
4. S. C. Basak, B. D. Gute, B. Lučić, S. Nikolić, N. Trinajstić, A comparative QSAR study of benzamidines complement-inhibitory activity and benzene derivatives acute toxicity, Comp. Chem. 24 (2000) 181–191.
5. L. H. Hall, L. B. Kier, G. Phipps, Structure-activity relationship studies on the toxicities of benzene derivatives 1. An additivity model, J. Env. Tox. Chem. 3 (1984) 355–365.
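The contrast the abstract draws between leave-one-out cross-validation and external validation can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic, the model is a one-descriptor least-squares line, and the helper names `fit_line`, `loo_q2` and `external_r2` are hypothetical. It does not reproduce the authors' data set, descriptors, or variable selection methods.

```python
# Illustrative sketch only: synthetic data and a one-descriptor linear model.
# Not the authors' data set or methods; all function names are hypothetical.
import random


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def loo_q2(xs, ys):
    """Leave-one-out Q^2 = 1 - PRESS / SS_total, computed on the training set."""
    my = sum(ys) / len(ys)
    press = 0.0
    for i in range(len(xs)):
        # Refit without sample i, then predict the left-out sample.
        a, b = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (a + b * xs[i])) ** 2
    ss_total = sum((y - my) ** 2 for y in ys)
    return 1.0 - press / ss_total


def external_r2(train_x, train_y, test_x, test_y):
    """Fit on the training set only, then score on the held-out set.

    The reference sum of squares uses the training mean, so the held-out
    samples play no part in building or centering the model.
    """
    a, b = fit_line(train_x, train_y)
    my_train = sum(train_y) / len(train_y)
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(test_x, test_y))
    ss_ref = sum((y - my_train) ** 2 for y in test_y)
    return 1.0 - sse / ss_ref


# Synthetic "descriptor vs. activity" data (assumed values, fixed seed).
random.seed(0)
xs = [random.uniform(0.0, 5.0) for _ in range(60)]
ys = [1.0 + 0.8 * x + random.gauss(0.0, 0.3) for x in xs]

# The external set is split off before any fitting takes place.
train_x, test_x = xs[:45], xs[45:]
train_y, test_y = ys[:45], ys[45:]

q2 = loo_q2(train_x, train_y)
r2_ext = external_r2(train_x, train_y, test_x, test_y)
print(f"LOO Q^2 = {q2:.3f}, external R^2 = {r2_ext:.3f}")
```

In this toy setting the two statistics can be compared directly on one split; a realistic comparison would repeat the split many times and would keep any variable selection step inside each validation loop.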
variable selection method ; toxicity ; benzenes
Contribution details
3-3.
2008.
published
Host publication details
MATH/CHEM/COMP 2008
Graovac, Ante ; Pokrić, Biserka ; Smrečki, Vilko
Zagreb: Institut Ruđer Bošković
978-953-6690-74-9
Conference details
MATH/CHEM/COMP 2008 COURSE
poster
16.06.2008-21.06.2008
Dubrovnik, Croatia