Distance Measures and Machine Learning Approaches for Codon Usage Analyses

Supek, Fran; Šmuc, Tomislav

Data source: CROSBI

Distance Measures and Machine Learning Approaches for Codon Usage Analyses (CROSBI ID 44075)

Book chapter | original scientific paper

Supek, Fran ; Šmuc, Tomislav Distance Measures and Machine Learning Approaches for Codon Usage Analyses // Codon Evolution - Mechanisms and Models / Cannarozzi, Gina ; Schneider, Adrian (eds.). Oxford: Oxford University Press, 2011. pp. 229-244

Responsibility details

Authors

Supek, Fran ; Šmuc, Tomislav

Basic information in the original language
Basic information in other languages

Language

English

Title

Distance Measures and Machine Learning Approaches for Codon Usage Analyses

Abstract

Unequal use of synonymous codons is a widespread phenomenon caused largely by directional mutation pressures, but also by natural selection for speed and/or accuracy of protein translation. Much effort has been dedicated to investigating whether this 'translational selection' has influenced the codon choice of highly expressed genes in various genomes. In such analyses, genes are typically represented as vectors of codon frequencies, and the data are analyzed using multivariate techniques, commonly either (a) dimensionality reduction, e.g. correspondence analysis, or (b) distance measures in the codon frequency space, such as the codon adaptation index (CAI). Such representations can be problematic because genes are too short to allow precise estimation of codon frequencies, which introduces noise and consequently leads to serious artifacts in some commonly used methods. A supervised machine learning approach, as embodied in the use of a classifier, provides an alternative that is more robust to noise and also more sensitive in detecting codon biases. We describe a Random Forest-based computational framework that enables control over confounding factors (here, the background nucleotide substitution patterns) while reliably detecting translational selection, demonstrated on a large set of prokaryotic genomes.
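
As an illustration of the representation mentioned in the abstract, the following minimal sketch maps a coding sequence to a 64-dimensional vector of codon frequencies and compares two genes by distance in that space. The function names `codon_frequencies` and `euclidean_distance` are hypothetical, and plain Euclidean distance is used only as a generic stand-in; it is not the CAI and not the Random Forest framework described in the chapter.

```python
from collections import Counter
from itertools import product
from math import sqrt

# All 64 codons in a fixed order, so every gene maps to a vector of equal length.
CODONS = ["".join(c) for c in product("ACGT", repeat=3)]
CODON_SET = set(CODONS)


def codon_frequencies(seq):
    """Return the 64-dimensional vector of relative codon frequencies.

    Hypothetical helper: assumes `seq` is an in-frame DNA coding sequence.
    Trailing bases that do not fill a codon and codons containing
    non-ACGT symbols are ignored.
    """
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(c for c in codons if c in CODON_SET)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no complete, unambiguous codons")
    return [counts[c] / total for c in CODONS]


def euclidean_distance(u, v):
    """Euclidean distance between two codon frequency vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


if __name__ == "__main__":
    gene_a = codon_frequencies("ATGAAAAAAGGC")
    gene_b = codon_frequencies("ATGAAGAAGGGT")
    print(euclidean_distance(gene_a, gene_b))
```

With only a handful of codons per gene, as in the toy sequences above, each frequency rests on very few counts; this is the estimation noise for short genes that the abstract refers to.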

Keywords

codon bias, supervised machine learning, translational selection, highly expressed genes, Random Forest

Note

not recorded

Language

not recorded

Title

not recorded

Abstract

not recorded

Keywords

not recorded

Note

not recorded

Chapter details

Pages

229-244.

Publication status

published

Book details

Book in which the chapter was published

Codon Evolution - Mechanisms and Models

Editors

Cannarozzi, Gina ; Schneider, Adrian

Publisher

Oxford: Oxford University Press

Year of publication

2011.

ISBN

9780199601665

Related entities

Related persons

Tomislav Šmuc (author)

Fran Supek (author)

Related institutions

Institut Ruđer Bošković (098) (author's institution)

Related projects

Strojno učenje prediktivnih modela u računalnoj biologiji (result of work on the project)

Field

Computer Science, Biology

Links

codonbook.com

ukcatalogue.oup.com