Bibliographic record no.: 404027

Improvement of Ensemble of Multi-Regression Structure-Toxicity Models by Clustering of Molecules in Descriptor Space


Bašic, Ivan; Lučić, Bono; Nikolić, Sonja; Papeš-Šokčević, Lidija; Nadramija, Damir
Improvement of Ensemble of Multi-Regression Structure-Toxicity Models by Clustering of Molecules in Descriptor Space // International Conference of Computational Methods in Sciences and Engineering 2008 ; Special Volume of the American Institute of Physics (AIP) - Conference Proceedings of ICCMSE 2008. Vol. 1148 / Simos, Theodore (ed.).
Melville: American Institute of Physics, 2009. pp. 408-411 (poster, international peer review, full paper (in extenso), scientific)


CROSBI ID: 404027

Title
Improvement of Ensemble of Multi-Regression Structure-Toxicity Models by Clustering of Molecules in Descriptor Space

Authors
Bašic, Ivan ; Lučić, Bono ; Nikolić, Sonja ; Papeš-Šokčević, Lidija ; Nadramija, Damir

Type, subtype and category of work
Papers in conference proceedings, full paper (in extenso), scientific

Source
International Conference of Computational Methods in Sciences and Engineering 2008 ; Special Volume of the American Institute of Physics (AIP) - Conference Proceedings of ICCMSE 2008. Vol. 1148 / Simos, Theodore - Melville : American Institute of Physics, 2009, 408-411

ISBN
978-0-7354-0685-8

Conference
International Conference of Computational Methods in Sciences and Engineering 2008

Place and date
Crete, Greece, 25-30 September 2008

Type of participation
Poster

Type of review
International peer review

Keywords
Acute aquatic toxicity; Organic molecules; QSAR models; Molecular descriptors; Distance based similarity; Clustering of molecules; Ensemble of multi-regression models; Clustered ensembles

Abstract
For a data set published by Russom et al. (Environ. Toxicol. Chem. 16, 948-967 (1997)) containing 704 organic molecules with measured acute aquatic toxicity (96-h LC50 tests), we calculated more than 1400 molecular descriptors with the Dragon 5.0 program.[1] After excluding descriptors with almost constant values and those with very low correlation with the logarithm of the LC50 values on the training set, about 620 descriptors remained and were used in the modeling process. The molecules were randomly partitioned into a training set and a test set containing 560 and 144 molecules, respectively. We developed and compared two kinds of ensembles of both linear and nonlinear multi-regression models: (1) normal ensembles and (2) ensembles obtained by clustering molecules according to their similarity (clustered ensembles). Molecules were clustered by calculating their Euclidean distances in normalized descriptor space. In the clustered approach, the final model for a selected test-set molecule was developed only on those training-set molecules that are close to it (as measured by Euclidean distance in normalized descriptor space). Although the results obtained by normal ensembles are very good (e.g. a nonlinear ensemble of 8-descriptor models: r_train = 0.91, s_train = 0.54, r_test = 0.76, s_test = 0.80), a significant improvement is obtained by taking the clustering of molecules into account when developing ensembles of linear models (e.g. 200 3-descriptor models in the ensemble: r_train = 0.91, s_train = 0.53, r_test = 0.836, s_test = 0.70; or 200 5-descriptor models in the ensemble: r_train = 0.94, s_train = 0.45, r_test = 0.84, s_test = 0.70). These results clearly indicate that information about the similarity between molecules can improve structure-toxicity models, and we expect that this could hold generally.
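
The clustered-ensemble idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how many neighbours are used, how the descriptors for each model are chosen, or how the individual predictions are combined. The sketch therefore assumes a fixed number of nearest training molecules (`n_neighbors`), randomly drawn descriptor subsets, ordinary least squares for each model, and a plain average over the ensemble. All function and parameter names are hypothetical.

```python
import numpy as np

def zscore(X_train, X_other):
    """Normalize descriptors using training-set mean and standard deviation."""
    mu = X_train.mean(axis=0)
    sd = X_train.std(axis=0)
    sd[sd == 0] = 1.0  # guard against constant descriptors
    return (X_train - mu) / sd, (X_other - mu) / sd

def fit_linear(X, y):
    """Ordinary least squares with an intercept; returns [b0, b1, ..., bk]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_linear(coef, X):
    return coef[0] + X @ coef[1:]

def clustered_ensemble_predict(X_train, y_train, x_query,
                               n_neighbors=50, n_models=200, n_desc=3, seed=0):
    """Predict one query molecule from models fitted on its nearest neighbours.

    Assumptions (not taken from the abstract): neighbours are the n_neighbors
    closest training molecules, descriptor subsets are drawn at random, and
    the ensemble prediction is the mean of the individual predictions.
    """
    rng = np.random.default_rng(seed)
    # Euclidean distance in (already normalized) descriptor space
    dist = np.linalg.norm(X_train - x_query, axis=1)
    near = np.argsort(dist)[:n_neighbors]
    X_loc, y_loc = X_train[near], y_train[near]
    preds = []
    for _ in range(n_models):
        cols = rng.choice(X_train.shape[1], size=n_desc, replace=False)
        coef = fit_linear(X_loc[:, cols], y_loc)
        preds.append(predict_linear(coef, x_query[cols][None, :])[0])
    return float(np.mean(preds))
```

In use, the training and test descriptor matrices would first be passed through `zscore`, and `clustered_ensemble_predict` would then be called once per test molecule. The default values mirror only the "200 3-descriptor models" figure quoted in the abstract; the neighbourhood size is an arbitrary placeholder.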

Original language
English

Scientific fields
Chemistry, Computer Science

Note
DOI: 10.1063/1.3225331



WORK AFFILIATION


Project / topic
079-0000000-3211 - Structure-activity relationships of flavonoids (Amić, Dragan, MZOS)
098-1770495-2919 - Development of methods for modeling the properties of bioactive molecules and proteins (Lučić, Bono, MZOS)

Institutions
Institut "Ruđer Bošković", Zagreb
Nastavni zavod za javno zdravstvo "Dr. Andrija Štampar"
PLIVA HRVATSKA d.o.o.

Cite this publication

Bašic, I., Lučić, B., Nikolić, S., Papeš-Šokčević, L. & Nadramija, D. (2009) Improvement of Ensemble of Multi-Regression Structure-Toxicity Models by Clustering of Molecules in Descriptor Space. In: Simos, T. (ed.) International Conference of Computational Methods in Sciences and Engineering 2008 ; Special Volume of the American Institute of Physics (AIP) - Conference Proceedings of ICCMSE 2008. Vol. 1148.
@article{article,
  editor = {Simos, T.},
  year = {2009},
  pages = {408-411},
  keywords = {Acute aquatic toxicity, Organic molecules, QSAR models, Molecular descriptors, Distance based similarity, Clustering of molecules, Ensemble of multi-regression models, Clustered ensembles},
  isbn = {978-0-7354-0685-8},
  title = {Improvement of Ensemble of Multi-Regression Structure-Toxicity Models by Clustering of Molecules in Descriptor Space},
  publisher = {American Institute of Physics},
  publisherplace = {Crete, Greece}
}



