Estimation of model and variable complexity and quality (CROSBI ID 723381)
Conference contribution in proceedings | abstract of conference presentation | international peer review
Responsibility data
Lučić, Bono ; Bojović, Viktor ; Kraljević, Antonija ; Batista, Jadranko
English
Estimation of model and variable complexity and quality
When developing structure-property molecular models, it is desirable to keep the models simple and to include only informative variables (descriptors), i.e. those that contain useful and interpretable structural information. Moreover, almost every dataset used for model development contains some redundant information. The protocol used to validate model quality is therefore extremely important. One part of the model validation procedure is the estimation and evaluation of the random accuracy resulting from the complexity (i.e. monotonicity) of the input data and from the model quality, the basis of which is mainly the number of parameters optimised in the model. The complexity of the dichotomous variables representing the molecular descriptors and the modelled property of the molecules, as well as the values predicted by the model, will be assessed by estimating the number of possible permutations (permutation entropy) of the values of a variable. Formulas were derived for new statistical measures that can be used to assess the quality and complexity of classification models and variables. In addition, formulas were derived for calculating the minimum and maximum possible accuracy/agreement of a model and the average random accuracy/agreement. For the case where the predicted and experimental variables have identical distributions, expressions for measuring the monotonicity of a variable can be obtained. Recent results from this area of research will be presented and illustrated with examples of models developed to predict the structure of membrane proteins, the toxicity of organic compounds and the folding rates of proteins.
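The abstract does not give the derived formulas, so the following is only a minimal sketch of the standard combinatorial quantities it alludes to, not the authors' own measures. It assumes a dichotomous (0/1) variable of N values and a two-class model whose class counts are fixed. The function names and the use of a base-2 logarithm of the permutation count as an "entropy" are illustrative assumptions.

```python
from math import comb, log2


def permutation_count(n_total: int, n_positive: int) -> int:
    """Number of distinct orderings of a 0/1 variable with n_positive ones."""
    return comb(n_total, n_positive)


def permutation_entropy_bits(n_total: int, n_positive: int) -> float:
    """Log2 of the permutation count (an assumed, illustrative definition)."""
    return log2(permutation_count(n_total, n_positive))


def accuracy_bounds(n_total: int, n_exp_pos: int, n_pred_pos: int):
    """Minimum, average random, and maximum accuracy for fixed class counts.

    n_exp_pos  : number of experimental positives
    n_pred_pos : number of predicted positives
    Agreement = TP + TN = (N - P - Q) + 2*TP, so the bounds follow from
    the smallest and largest feasible number of true positives.
    """
    tp_min = max(0, n_exp_pos + n_pred_pos - n_total)
    tp_max = min(n_exp_pos, n_pred_pos)
    base = n_total - n_exp_pos - n_pred_pos
    acc_min = (base + 2 * tp_min) / n_total
    acc_max = (base + 2 * tp_max) / n_total
    # Expected agreement when predictions are randomly permuted.
    acc_rand = (
        n_exp_pos * n_pred_pos
        + (n_total - n_exp_pos) * (n_total - n_pred_pos)
    ) / n_total**2
    return acc_min, acc_rand, acc_max


if __name__ == "__main__":
    print(permutation_count(10, 3))
    print(accuracy_bounds(10, 3, 4))
```

Under these assumptions, a variable with equal numbers of zeros and ones has the largest permutation count, and the interval between the minimum and maximum accuracy narrows as the predicted and experimental class counts become more unbalanced in opposite directions.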
variable complexity, random agreement, dichotomous variable, permutation entropy, model quality
Basic grant of MZO/RBI to Bono Lučić DOK-2018-01-9531 https://www.rbc2022.hu/i-programme.php
Contribution data
2022.
published
Parent publication data
Conference data
Regional Biophysics Conference - RBC2022
invited lecture
22.08.2022-26.08.2022
Pécs, Hungary