Bibliographic record number: 1016109
Effect of information leakage and method of splitting (rational and random) on external predictive ability and behavior of different statistical parameters of QSAR model
Effect of information leakage and method of splitting (rational and random) on external predictive ability and behavior of different statistical parameters of QSAR model // Medicinal chemistry research, 24 (2015), 3; 1241-1264 doi:10.1007/s00044-014-1193-8 (international peer review, article, scientific)
CROSBI ID: 1016109
Title
Effect of information leakage and method of splitting (rational and random) on external predictive ability and behavior of different statistical parameters of QSAR model
Authors
Masand, Vijay H. ; Mahajan, Devidas T. ; Nazeruddin, Gulam M. ; Ben Hadda, Taibi ; Rastija, Vesna ; Alfeefy, Ahmed M.
Source
Medicinal chemistry research (1054-2523) 24 (2015), 3; 1241-1264
Type, subtype and category of work
Journal papers, article, scientific
Keywords
QSAR ; external validation ; statistical parameters ; splitting methods ; predictivity
Abstract
Quantitative Structure-Activity Relationship (QSAR) analysis not only provides guidelines regarding the structural features responsible for biological activity but can also be used to predict the desired activity of untested chemicals prior to synthesis. Therefore, appropriate validation of any QSAR model is of utmost importance for judging its external predictive ability. Generally, internal and external validations (preferred by many) are used in the absence of a true external dataset. A model developed using the external method may not be reliable, as it may fail to capture all essential features required for the particular SAR owing to the omission of some compounds, especially for small datasets. In external validation, the dataset is split either rationally or randomly before descriptor selection. In the present study, the dataset was split rationally using a novel method, and the effect on the statistical parameters was analyzed. The analysis reveals that the predictive ability of a QSAR model is sensitive to (1) the method of splitting and (2) the distribution of the training and prediction sets. In addition, purposeful selection can be used to influence the statistical parameters; therefore, external validation based on a single split is insufficient to guarantee the true predictive ability of a QSAR model. Moreover, it appears that selecting descriptors prior to splitting (information leakage) has little role in deciding the external predictivity of the model. The present study reveals that as many statistical parameters as possible should be examined, along with bootstrapping, instead of a single external validation.
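The abstract's point that a single split cannot establish predictive ability can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' method or data: the dataset is synthetic, the "model" is a one-descriptor least-squares line, the helper names (`fit_line`, `external_r2`, `repeated_random_splits`) are hypothetical, and the external statistic is one common convention (1 - PRESS/SS, with SS taken around the training-set mean). The sketch repeats a random training/prediction split many times and reports how widely the external statistic varies.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x (a single descriptor)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def external_r2(train, test):
    """Fit on the training set, then score on the prediction set.

    Assumed convention: 1 - PRESS / SS, where SS is computed around
    the training-set mean of the response.
    """
    a, b = fit_line([x for x, _ in train], [y for _, y in train])
    y_train_mean = statistics.fmean([y for _, y in train])
    press = sum((y - (a + b * x)) ** 2 for x, y in test)
    ss = sum((y - y_train_mean) ** 2 for _, y in test)
    return 1.0 - press / ss


def repeated_random_splits(data, n_splits=200, test_fraction=0.25, seed=0):
    """Return the external statistic for each of n_splits random splits."""
    rng = random.Random(seed)
    n_test = max(2, round(len(data) * test_fraction))
    scores = []
    for _ in range(n_splits):
        shuffled = data[:]
        rng.shuffle(shuffled)
        test, train = shuffled[:n_test], shuffled[n_test:]
        scores.append(external_r2(train, test))
    return scores


# Synthetic (descriptor, activity) pairs; purely illustrative.
_rng = random.Random(42)
data = []
for _ in range(40):
    x = _rng.uniform(0.0, 10.0)
    data.append((x, 1.5 + 0.8 * x + _rng.gauss(0.0, 1.5)))

scores = repeated_random_splits(data)
print("external R2 over splits: min", round(min(scores), 3),
      "mean", round(statistics.fmean(scores), 3),
      "max", round(max(scores), 3))
```

Under these assumptions, the spread between the minimum and maximum values shows how much a single, possibly purposefully chosen, split can flatter or penalize the same model. Reporting the distribution over many splits, or over bootstrap resamples, is closer to what the abstract recommends.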
Original language
English
Scientific fields
Agriculture (agronomy), Biotechnology
Journal indexed in:
- Current Contents Connect (CCC)
- Web of Science Core Collection (WoSCC)
- Science Citation Index Expanded (SCI-EXP)
- SCI-EXP, SSCI and/or A&HCI
- Scopus