Modeling toxicity of nitroaromatics: Comparative analysis of different variable and model selection methods

Milenković, Dejan; Batista, Jadranko; Lučić, Bono; Rasulev, Bakhtiyor

Data source: CROSBI

Modeling toxicity of nitroaromatics: Comparative analysis of different variable and model selection methods (CROSBI ID 672319)

Conference contribution in proceedings | abstract of conference presentation | international peer review

Milenković, Dejan ; Batista, Jadranko ; Lučić, Bono ; Rasulev, Bakhtiyor Modeling toxicity of nitroaromatics: Comparative analysis of different variable and model selection methods // Math/Chem/Comp 2018, 30th MC2 Conference : Book of abstract / Vančik, Hrvoje ; Cioslowski, Jerzy (eds.). Zagreb, 2018. p. 11

Responsibility details

Authors

Milenković, Dejan ; Batista, Jadranko ; Lučić, Bono ; Rasulev, Bakhtiyor

Basic data in the original language
Basic data in other languages

Language

English

Title

Modeling toxicity of nitroaromatics: Comparative analysis of different variable and model selection methods

Abstract

The study of nitroaromatic compounds such as nitrobenzene derivatives is an important task because of their increasing influence on living organisms and on the environment in general [1]. These compounds, their derivatives and their metabolic products show mutagenic and carcinogenic effects, and also cause allergic reactions, endocrine system impairment and skin irritation [2]. In a recent study, the authors performed a comprehensive quantitative structure-toxicity relationship study on rats using a set of 90 nitroaromatics whose structures were described by three sets having ~800, 20000 and 1500 different descriptors (based on 1D, 2D and 3D structures) calculated by the PaDEL, HiTQSAR and Dragon programs, respectively [2]. Among them, the Dragon set contains the largest portion of 3D descriptors. Additionally, two smaller sets of 3D descriptors based on semiempirical and DFT structures were calculated, and all descriptors were selected into Multivariate Linear Regression (MLR) models by a method based on a genetic algorithm, as implemented in QSARIns [3]. In modeling, experimental values of the concentrations that were lethal to 50% of the tested organisms (LD50) were used as endpoints. In summary, better models were obtained from descriptor sets composed mostly of 1D or 2D descriptors (HiTQSAR, PaDEL) and from sets containing only smaller subsets of 3D descriptors (Dragon). The worst models were those based on semiempirical and DFT 3D descriptors.

To test whether the models obtained in [2] can be improved, we performed an analysis using only the Dragon 5.5 set of descriptors and extended the previous study by: (1) enlarging the data set with 100 novel nitroaromatics obtained from the LD50 database included in the program TEST 4.2.1 [4] (available on the U.S. Environmental Protection Agency web site), (2) using an additional algorithm that performs detailed descriptor selection by considering all combinations of fixed-size subsets of descriptors [5], and (3) using the more robust Partial Least Squares (PLS) method for model generation [6]. The novel set of 100 nitroaromatics contains compounds highly similar to the 90 compounds from ref. [2]. This set was used as an additional set for testing the external predictivity of the developed models.

As the main result, we found that the Root Mean Squared Error (RMSE) of prediction by the U.S. EPA method TEST 4.2.1 for the 10 compounds (out of the 90 nitroaromatics from [2]) that are not included in the TEST 4.2.1 database is ~1.17 log units. This justifies the need to improve existing models of nitroaromatics. RMSEs of fit or cross-validation of LD50 values for the 90 nitroaromatics were between 0.28 and 0.7 on the log scale in all considered PLS and MLR models, but for the novel test set of 100 nitroaromatics the RMSEs are ~0.75 to 1.0 (for all models). The lowest RMSE of prediction on the 100 compounds, 0.75, is obtained by complex PLS models having 13 components (formed from ~40 initial descriptors), while the best one-descriptor MLR model (the number of phosphorus atoms) gives an RMSE of 0.89. Almost all obtained MLR and PLS models have RMSEs of prediction on the large external set comparable to or even lower than that of the TEST 4.2.1 program, confirming that there is room for improvement of structure-toxicity models of nitroaromatics.

[1] P.-Z. Lang, G.-H. Lu, Chem. J. Chin. Univ. 16 (1995) 1083–1087.
[2] A. Gooch et al., Environ. Toxicol. Chem. 36 (2017) 2227–2233.
[3] P. Gramatica et al., J. Comput. Chem. 34 (2013) 2121–2132.
[4] TEST 4.2.1 program, 2016, https://www.epa.gov/chemical-research/toxicity-estimation-software-tool-test.
[5] B. Lučić, N. Trinajstić, J. Chem. Inf. Comput. Sci. 39 (1999) 121–132.
[6] K. Roy, P. Ambure, Chemom. Intell. Lab. Syst. 159 (2016) 108–126.
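The abstract mentions descriptor selection over all combinations of fixed-size subsets and the use of RMSE on an external set. The following is only a minimal sketch of that general idea, not the authors' algorithm from ref. [5]: it assumes a plain least-squares MLR fit with an intercept, ranks subsets by RMSE of fit, and runs on synthetic random data. The function names (`rmse`, `fit_mlr`, `predict_mlr`, `best_subset`) and all numbers in the example are hypothetical and unrelated to the nitroaromatic data set.

```python
from itertools import combinations

import numpy as np


def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def fit_mlr(X, y):
    """Ordinary least-squares fit with an intercept; returns the coefficients."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_mlr(coef, X):
    """Predict with coefficients produced by fit_mlr (intercept first)."""
    A = np.column_stack([np.ones(len(X)), X])
    return A @ coef


def best_subset(X_train, y_train, size):
    """Fit every descriptor subset of the given size; keep the lowest fit RMSE."""
    best = None
    for cols in combinations(range(X_train.shape[1]), size):
        idx = list(cols)
        coef = fit_mlr(X_train[:, idx], y_train)
        err = rmse(y_train, predict_mlr(coef, X_train[:, idx]))
        if best is None or err < best[0]:
            best = (err, cols, coef)
    return best


# Synthetic stand-in data (an assumption for illustration, NOT the real data):
# 90 "training" and 100 "external" rows, 8 descriptors, 2 of them informative.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(90, 8))
X_test = rng.normal(size=(100, 8))
true_w = np.zeros(8)
true_w[[1, 4]] = [1.5, -2.0]
y_train = X_train @ true_w + rng.normal(scale=0.3, size=90)
y_test = X_test @ true_w + rng.normal(scale=0.3, size=100)

fit_err, cols, coef = best_subset(X_train, y_train, size=2)
ext_err = rmse(y_test, predict_mlr(coef, X_test[:, list(cols)]))
print("selected descriptors:", cols)
print("RMSE of fit:", round(fit_err, 3), "external RMSE:", round(ext_err, 3))
```

The number of subsets grows combinatorially with the descriptor count, so a full enumeration of this kind is practical only for small subset sizes or after pre-filtering the descriptor pool.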

Keywords

toxicity modeling ; nitrobenzene derivatives ; variable selection ; model validation ; external predictivity ; MLR ; PLS

Note

not recorded

Language

not recorded

Title

not recorded

Abstract

not recorded

Keywords

not recorded

Note

not recorded

Contribution details

Pages

11

Year of publication

2018

Publication status

published

Host publication details

Title

Math/Chem/Comp 2018, 30th MC2 Conference : Book of abstract

Editors

Vančik, Hrvoje ; Cioslowski, Jerzy

Publisher

Zagreb:

Conference details

Conference

30th Math/Chem/Comp Conference

Type of participation

poster

Conference dates

18.06.2018-23.06.2018

Conference venue

Dubrovnik, Croatia

Related records

Related persons

Bono Lučić (author)

Related institutions

Institut Ruđer Bošković (098) (author's institution)

Field

Chemistry

Links

pmf.unizg.hr