Bibliographic record number: 1216658
Estimation of model and variable complexity and quality
Estimation of model and variable complexity and quality // Regional Biophysics Conference - RBC2022
Pécs, Hungary, 2022 (invited lecture, international peer review, other, scientific)
CROSBI ID: 1216658
Title
Estimation of model and variable complexity and quality
Authors
Lučić, Bono ; Bojović, Viktor ; Kraljević, Antonija ; Batista, Jadranko
Type, subtype and category of work
Conference abstracts, other, scientific
Conference
Regional Biophysics Conference - RBC2022
Place and date
Pécs, Hungary, 22-26 August 2022
Type of participation
Invited lecture
Type of review
International peer review
Keywords
variable complexity, random agreement, dichotomous variable, permutation entropy, model quality
Abstract
When developing structure-property molecular models, it is desirable to keep them simple and to include only informative variables (descriptors), i.e. those that contain useful and interpretable structural information. Moreover, almost every dataset used for model development contains some redundant information. The protocol used to validate model quality is therefore extremely important. One part of the model validation procedure is the estimation and evaluation of the random accuracy that results from the complexity (i.e. monotonicity) of the input data and from the model quality, the basis of which is mainly the number of parameters optimised in the model. The complexity of the dichotomous variables representing the molecular descriptors and the modelled property of the molecules, as well as of the values predicted by the model, will be assessed by estimating the number of possible permutations of the values of a variable (permutation entropy). Formulae were derived for new statistical measures that can be used to assess the quality and complexity of classification models and variables. In addition, formulae were derived for calculating the minimum and maximum possible accuracy/agreement of the model and the average random accuracy/agreement. For the case where the predicted and experimental variables have identical distributions, we obtain expressions for measuring the monotonicity of a variable. Recent results from this area of research will be presented and illustrated with examples of models developed to predict the structure of membrane proteins, the toxicity of organic compounds and the folding rates of proteins.
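The abstract does not state the derived formulae, so the following is only a minimal sketch of the kind of quantities it describes, built from standard combinatorics rather than from the authors' own expressions. It assumes a dichotomous (0/1) variable of length N, takes the number of distinct permutations as the binomial coefficient, and takes "permutation entropy" to mean the natural logarithm of that count (an assumption). The function names `permutation_count`, `permutation_entropy` and `agreement_bounds` are hypothetical.

```python
from math import comb, log


def permutation_count(n_total: int, n_ones: int) -> int:
    """Number of distinct orderings of a 0/1 variable with n_ones ones."""
    return comb(n_total, n_ones)


def permutation_entropy(n_total: int, n_ones: int) -> float:
    """Natural log of the permutation count (assumed definition)."""
    return log(permutation_count(n_total, n_ones))


def agreement_bounds(n_total: int, n_ones_exp: int, n_ones_pred: int):
    """Minimum, average random, and maximum accuracy for fixed class counts.

    Accuracy = (TP + TN) / N, and with the counts of ones fixed in both the
    experimental and predicted variables, TN = N - a - b + TP, so accuracy
    depends only on the number of true positives TP.
    """
    n, a, b = n_total, n_ones_exp, n_ones_pred
    tp_min = max(0, a + b - n)   # smallest overlap the counts allow
    tp_max = min(a, b)           # largest overlap the counts allow
    tp_rand = a * b / n          # expected overlap under random permutation

    def accuracy(tp):
        return (n - a - b + 2 * tp) / n

    return accuracy(tp_min), accuracy(tp_rand), accuracy(tp_max)


if __name__ == "__main__":
    # Illustrative counts only: 10 molecules, 4 experimental and 6 predicted positives.
    print(permutation_count(10, 4))
    print(agreement_bounds(10, 4, 6))
```

When the predicted and experimental variables have identical distributions (the same number of ones), the maximum accuracy in this sketch is 1 and the average random accuracy depends only on how unevenly the two classes are populated, which is one way to read the abstract's remark about measuring the monotonicity of a variable.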
Original language
English
Scientific fields
Physics, Chemistry, Interdisciplinary natural sciences
Note
Basic grant of MZO/RBI to Bono Lučić
DOK-2018-01-9531
https://www.rbc2022.hu/i-programme.php
RELATED PROJECTS AND INSTITUTIONS
Projects:
EK-KF-KK.01.1.1.01.0002 - Bioprospecting of the Adriatic Sea (Jerković, Igor; Dragović-Uzelac, Verica; Šantek, Božidar; Čož-Rakovac, Rozelinda; Kraljević Pavelić, Sandra; Jokić, Stela, EK)
HRZZ-DOK-2018-01-9531 - Bioprospecting of the Adriatic Sea (Lučić, Bono, HRZZ - 2018-01)
KK.01.1.1.01.0009 - Advanced Methods and Technologies in Data Science and Cooperative Systems (EK)
Institutions:
Ruđer Bošković Institute, Zagreb