Bibliographic record number: 982610
Modeling toxicity of nitroaromatics: Comparative analysis of different variable and model selection methods
Modeling toxicity of nitroaromatics: Comparative analysis of different variable and model selection methods // Math/Chem/Comp 2018, 30th MC2 Conference : Book of abstract / Vančik, Hrvoje ; Cioslowski, Jerzy (eds.).
Zagreb, 2018, p. 11 (poster, international peer review, abstract, scientific)
CROSBI ID: 982610
Title
Modeling toxicity of nitroaromatics: Comparative analysis of different variable and model selection methods
Authors
Milenković, Dejan ; Batista, Jadranko ; Lučić, Bono ; Rasulev, Bakhtiyor
Type, subtype and category of work
Conference abstracts, abstract, scientific
Source
Math/Chem/Comp 2018, 30th MC2 Conference : Book of abstract / Vančik, Hrvoje ; Cioslowski, Jerzy - Zagreb, 2018, 11-11
Conference
30th Math/Chem/Comp Conference
Place and date
Dubrovnik, Croatia, 18.06.2018 - 23.06.2018
Type of participation
Poster
Type of peer review
International peer review
Keywords
toxicity modeling ; nitrobenzene derivatives ; variable selection ; model validation ; external predictivity ; MLR ; PLS
Abstract
The study of nitroaromatic compounds, such as nitrobenzene derivatives, is a very important task because of their increased influence on living organisms and on the environment in general [1]. These compounds, their derivatives and their metabolic products show mutagenic and carcinogenic effects and can cause allergic reactions, endocrine system impairment and skin irritation [2]. In a recent study, the authors performed a comprehensive quantitative structure-toxicity relationship analysis of toxicity in rats using a set of 90 nitroaromatics whose structures were described by three sets of ~800, 20000 and 1500 different descriptors (based on 1D, 2D and 3D structures) calculated by the PaDEL, HiTQSAR and Dragon programs, respectively [2]. Among them, the Dragon set contains the largest portion of 3D descriptors. Additionally, two smaller sets of 3D descriptors based on semiempirical and DFT structures were calculated, and descriptors from all sets were selected into Multivariate Linear Regression (MLR) models by a method based on a genetic algorithm, as implemented in QSARINS [3]. In modeling, experimental values of the concentrations that were lethal to 50% of the tested organisms (LD50) were used as endpoints. In summary, better models were obtained from descriptor sets composed mostly of 1D or 2D descriptors (HiTQSAR, PaDEL) and from sets that contain only smaller subsets of 3D descriptors (Dragon). The worst models were those based on semiempirical and DFT 3D descriptors.
To test whether the models obtained in [2] can be improved, we performed an analysis using only the Dragon 5.5 set of descriptors and extended the previous study by: (1) enlarging the data set by 100 novel nitroaromatics obtained from the LD50 database included in the program TEST 4.2.1 [4] (available on the U.S. Environmental Protection Agency web site), (2) using an additional algorithm that performs detailed descriptor selection by considering all combinations of fixed-size subsets of descriptors [5], and (3) using the more robust Partial Least Squares (PLS) method for model generation [6]. The novel set of 100 nitroaromatics contains compounds that are highly similar to the 90 compounds from ref. [2]. This set was used as an additional set for testing the external predictivity of the developed models. As the main result, we found that the Root Mean Squared Error (RMSE) of prediction by the U.S. EPA program TEST 4.2.1 for the 10 compounds (out of the 90 nitroaromatics from [2]) that are not included in the TEST 4.2.1 database is ~1.17 log units. This justifies the need to improve existing models of nitroaromatics. RMSEs of fit or cross-validation of LD50 values for the 90 nitroaromatics were between 0.28 and 0.7 on the log scale in all considered PLS and MLR models. For the novel test set of 100 nitroaromatics, however, RMSEs are ~0.75 - 1.0 (for all models). The lowest RMSE of prediction on the 100 compounds, 0.75, was obtained by complex PLS models having 13 components (formed from ~40 initial descriptors), while the best one-descriptor MLR model (the number of phosphorus atoms) gives an RMSE of 0.89. Almost all obtained MLR and PLS models have RMSEs of prediction on the large external set that are comparable to or even lower than that of the TEST 4.2.1 program, confirming that there is room for improvement of structure-toxicity models of nitroaromatics.
[1] P.-Z. Lang, G.-H. Lu, Chem. J. Chin. Univ. 16 (1995) 1083–1087.
[2] A. Gooch et al., Environ. Toxicol. Chem. 36 (2017) 2227–2233.
[3] P. Gramatica et al., J. Comput. Chem. 34 (2013) 2121–2132.
[4] TEST 4.2.1 program, 2016, https://www.epa.gov/chemical-research/toxicity-estimation-software-tool-test.
[5] B. Lučić, N. Trinajstić, J. Chem. Inf. Comput. Sci. 39 (1999) 121–132.
[6] K. Roy, P. Ambure, Chemom. Intell. Lab. Syst. 159 (2016) 108–126.
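The workflow named in the abstract (selecting descriptors by trying all fixed-size subsets, fitting an MLR model, and checking RMSE on an external set) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic random numbers rather than the nitroaromatic LD50 values, the helper names (rmse, fit_mlr, predict_mlr, best_subset) are hypothetical, and subsets are ranked by RMSE of fit only. It does not reproduce the algorithm of ref. [5] or the models reported above.

```python
import itertools
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def fit_mlr(X, y):
    """Ordinary least-squares fit with an intercept; returns [b0, b1, ...]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_mlr(coef, X):
    """Predict with coefficients returned by fit_mlr."""
    return coef[0] + X @ coef[1:]

def best_subset(X, y, size):
    """Try every descriptor subset of the given size; keep the lowest fit RMSE."""
    best = None
    for cols in itertools.combinations(range(X.shape[1]), size):
        idx = list(cols)
        coef = fit_mlr(X[:, idx], y)
        err = rmse(y, predict_mlr(coef, X[:, idx]))
        if best is None or err < best[0]:
            best = (err, cols, coef)
    return best

# Synthetic stand-in data (assumption): 90 training and 100 external "compounds",
# 6 "descriptors", with the response depending only on descriptors 1 and 4.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(90, 6))
X_ext = rng.normal(size=(100, 6))

def response(X):
    return 1.5 * X[:, 1] - 2.0 * X[:, 4]

y_train = response(X_train) + rng.normal(scale=0.3, size=90)
y_ext = response(X_ext) + rng.normal(scale=0.3, size=100)

fit_err, cols, coef = best_subset(X_train, y_train, size=2)
ext_err = rmse(y_ext, predict_mlr(coef, X_ext[:, list(cols)]))
print("selected descriptors:", cols)
print("RMSE of fit:", round(fit_err, 3), "external RMSE:", round(ext_err, 3))
```

Exhaustive search is only practical for small subset sizes, because the number of combinations grows quickly with the size of the descriptor pool. Ranking by cross-validated RMSE instead of RMSE of fit would be the natural refinement.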
Original language
English
Scientific fields
Chemistry