Bibliographic record number: 121744

Toward generating simpler QSAR models: Nonlinear multivariate regression versus several neural network ensembles and some related methods


Lučić, Bono; Nadramija, Damir; Bašic, Ivan; Trinajstić, Nenad
Toward generating simpler QSAR models: Nonlinear multivariate regression versus several neural network ensembles and some related methods // Journal of chemical information and computer sciences, 43 (2003), 4; 1094-1102 (international peer review, article, scientific)


CROSBI ID: 121744

Title
Toward generating simpler QSAR models: Nonlinear multivariate regression versus several neural network ensembles and some related methods

Authors
Lučić, Bono ; Nadramija, Damir ; Bašic, Ivan ; Trinajstić, Nenad

Source
Journal of chemical information and computer sciences (0095-2338) 43 (2003), 4; 1094-1102

Type, subtype and category of work
Journal papers, article, scientific

Keywords
Variable selection; derivatives; outperforms

Abstract
In this study we want to test whether a simple modeling procedure used in the field of QSAR/QSPR can produce simple models that will be, at the same time, as accurate as robust Neural Network Ensemble (NNE) ones. We present results of application of two procedures for generating/selecting simple linear and nonlinear multiregression (MR) models: (1) method for selecting the best possible MR models (named as CROMRsel) and (2) Genetic Function Approximation (GFA) method from the Cerius2 program package. The obtained MR models are strictly compared with several NNE models. For the comparison we selected four QSAR data sets previously studied by NNE (Tetko et al. J. Chem. Inf. Comput. Sci. 1996, 36, 794-803. Kovalishyn et al. J. Chem. Inf. Comput. Sci. 1998, 38, 651-659.): (1) 51 benzodiazepine derivatives, (2) 37 carboquinone derivatives, (3) 74 pyrimidines, and (4) 31 antimycin analogues. These data sets were parametrized with 7, 6, 27, and 53 descriptors, respectively. Modeled properties were anti-pentylenetetrazole activity, antileukemic activity, inhibition constants to dihydrofolate reductase from MB1428 E. coli, and antifilarial activity, respectively. Nonlinearities were introduced into the MR models through 2-fold and/or 3-fold cross-products of initial (linear) descriptors. Then, using the CROMRsel and GFA programs (J. Chem. Inf. Comput. Sci. 1999, 39, 121-132) the sets of I (I ≤ 8, in this paper) the best descriptors (according to the fit and leave-one-out correlation coefficients) were selected for multiregression models. Two classes of models were obtained: (1) linear or nonlinear MR models which were generated starting from the complete set of descriptors, and (2) nonlinear MR models which were generated starting from the same set of descriptors that was used in the NNE modeling. In addition, the descriptor selection method from CROMRsel was compared with the GFA method included in the QSAR module of the Cerius2 program.
For each data set it has been found that the MR models have better cross-validated statistical parameters than the corresponding NNE models and that CROMRsel selects somewhat better MR models than the GFA method. MR models are also much simpler than NNEs, which is the important surprising fact, and, additionally, express calculated dependencies in a functional form. Moreover, MR models were shown to be better than all other models obtained by different methods on the same data sets ("old" multivariate regressions, functional-link-net models, back-propagation neural networks, genetic algorithm, and partial least squares models). This study also indicated that the robust NNE models cannot generate good models when applied on small data sets, suggesting that it is perhaps better to apply robust methods (like NNE ones) on larger data sets.
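
The two steps the abstract describes (adding cross-product descriptors, then ranking descriptor subsets by a leave-one-out correlation coefficient) can be illustrated with a minimal Python sketch. This is not the CROMRsel or GFA algorithm: the function names, the brute-force subset search, the inclusion of squared terms among the cross-products, and the use of the Pearson correlation between observed and leave-one-out predicted values are all assumptions made for illustration only.

```python
from itertools import combinations, combinations_with_replacement

import numpy as np


def cross_product_terms(X, max_order=2):
    """Return the original descriptors plus their products up to max_order.

    Assumption: squares and cubes (e.g. x0*x0) count as cross-products.
    Each column is labelled by the tuple of descriptor indices multiplied.
    """
    cols, names = [], []
    for order in range(1, max_order + 1):
        for idx in combinations_with_replacement(range(X.shape[1]), order):
            cols.append(np.prod(X[:, list(idx)], axis=1))
            names.append(idx)
    return np.column_stack(cols), names


def loo_predictions(X, y):
    """Leave-one-out predictions of an ordinary least-squares model.

    Uses the hat-matrix identity e_loo = e / (1 - h_ii), so the model is
    fitted once instead of n times. An intercept column is added here.
    """
    A = np.column_stack([np.ones(len(y)), X])
    H = A @ np.linalg.pinv(A)
    resid = y - H @ y
    return y - resid / (1.0 - np.diag(H))


def best_subset(X, y, size):
    """Exhaustively pick the `size` columns with the highest LOO correlation.

    Brute force is only feasible for small descriptor pools; it stands in
    for the (unspecified here) search strategy of the real programs.
    """
    best_r, best_cols = -np.inf, None
    for subset in combinations(range(X.shape[1]), size):
        pred = loo_predictions(X[:, list(subset)], y)
        r = np.corrcoef(y, pred)[0, 1]
        if r > best_r:
            best_r, best_cols = r, subset
    return best_r, best_cols
```

A call such as `best_subset(*cross_product_terms(X, 2)[:1], y, size=3)` would search three-term models over the expanded pool; for pools as large as the 27 or 53 descriptors mentioned in the abstract, an exhaustive search over cross-products quickly becomes expensive, which is one reason dedicated selection programs exist.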

Original language
English

Scientific fields
Chemistry



WORK AFFILIATIONS


Project / topic
0098034

Institutions
Institut "Ruđer Bošković", Zagreb

Profiles:

Damir Nadramija (author)

Bono Lučić (author)

Nenad Trinajstić (author)

Cite this publication

Lučić, Bono; Nadramija, Damir; Bašic, Ivan; Trinajstić, Nenad
Toward generating simpler QSAR models: Nonlinear multivariate regression versus several neural network ensembles and some related methods // Journal of chemical information and computer sciences, 43 (2003), 4; 1094-1102 (international peer review, article, scientific)
Lučić, B., Nadramija, D., Bašic, I. & Trinajstić, N. (2003) Toward generating simpler QSAR models: Nonlinear multivariate regression versus several neural network ensembles and some related methods. Journal of chemical information and computer sciences, 43 (4), 1094-1102.
@article{article,
  author = {Lučić, Bono and Nadramija, Damir and Bašic, Ivan and Trinajstić, Nenad},
  title = {Toward generating simpler QSAR models: Nonlinear multivariate regression versus several neural network ensembles and some related methods},
  journal = {Journal of chemical information and computer sciences},
  year = {2003},
  volume = {43},
  number = {4},
  pages = {1094-1102},
  issn = {0095-2338},
  keywords = {Variable selection, derivatives, outperforms}
}

The journal is indexed in:


  • Current Contents Connect (CCC)
  • Web of Science Core Collection (WoSCC)
    • Science Citation Index Expanded (SCI-EXP)
    • SCI-EXP, SSCI and/or A&HCI
  • Scopus
  • MEDLINE


Included in other bibliographic databases:


  • Chemical Abstracts