Data source: CROSBI

Novel Bioinformatics Tool for the Prediction and Analysis of G-Quadruplexes (CROSBI ID 400415)

Thesis | graduate thesis

Muhović, Imer. Novel Bioinformatics Tool for the Prediction and Analysis of G-Quadruplexes / Marjanović, Damir (mentor); Doluca, Osman (direct supervisor). Sarajevo, Bosnia and Herzegovina, 2015

Statement of responsibility

Muhović, Imer

Marjanović, Damir

Doluca, Osman

English

Novel Bioinformatics Tool for the Prediction and Analysis of G-Quadruplexes

G-quadruplexes are novel sequences of interest that have recently been implicated in regulatory roles within the chromosome. Tools exist to predict their possible locations, but they are sparse in features. Using an artificial neural network, we have created a method to predict the melting temperature of such sequences. The tool was developed in several phases. We wanted an intuitive, easy-to-use tool that selects all nucleotides capable of contributing to a G-quadruplex structure and analyzes their melting temperature, in order to find the structures most likely to form under physiological conditions. We used the Python programming language to construct the core algorithm, which uses regular expressions to find all stretches of guanines in the given sequence, assigns identity values to those G-boxes, and builds a tree structure from them. The tree is then traversed to obtain all possible combinations of the G-boxes, and thus all possible G-quadruplexes. Duplicates are then pruned, and the G-quadruplexes are run through an artificial neural network that predicts the melting temperature of each G-quadruplex. We used the PyBrain library for Python to construct the artificial neural network, using a previously published dataset of 260 quadruplexes that included their sequences, the physiological conditions under which they formed G-quadruplexes, and their melting temperatures. After analyzing the literature, we decided to use the melting temperature as an indicator of the stability of the quadruplexes. The artificial neural network was trained on a subset of 108 sequences that formed G-quadruplexes in solution with K+ ions, which are thought to be biologically relevant because K+ is present at high concentrations within human cells.
We combined the algorithm for finding possible G-quadruplexes with the neural network into a web tool built with the Django web development framework. The web tool allows even novice users to input their sequence data, choose parameters to their liking, and obtain the G-quadruplexes most likely to form within the sequence. The results are presented in tabular format and are available for download in multiple spreadsheet formats. Our research into the use of neural networks has left us wanting larger, more complete datasets: 260 sequences is a small sample that leaves many variables untested, such as the size of the G-blocks and the nucleotide composition of the loops, and the dataset gives no information on the conformation that each G-quadruplex adopts. Despite these issues, we hope to have created a tool that will be usable by the wider scientific community for years to come.
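
The search step described in the abstract (regular expressions to locate guanine stretches, then a tree traversal over them) can be illustrated with a minimal sketch. This is not the thesis code: the function names, the `min_run` and `max_loop` parameters, and their default values are assumptions chosen for illustration, and each maximal run of guanines is treated as a single G-box, which may be simpler than the original method. The neural-network step is not shown.

```python
import re


def find_g_runs(seq, min_run=3):
    """Return (start, end) spans of every maximal run of at least min_run guanines."""
    pattern = re.compile(r"G{%d,}" % min_run)
    return [m.span() for m in pattern.finditer(seq.upper())]


def enumerate_quadruplexes(seq, min_run=3, max_loop=7):
    """Enumerate ordered sets of four G-runs whose connecting loops are 1..max_loop long.

    Each G-run acts as a tree node; its children are the later G-runs that lie
    within the allowed loop distance. Every root-to-depth-4 path is one candidate.
    (max_loop is a hypothetical parameter, not taken from the thesis.)
    """
    runs = find_g_runs(seq, min_run)
    results = set()  # a set discards duplicate candidates

    def extend(path):
        if len(path) == 4:
            results.add(tuple(path))
            return
        last_end = path[-1][1]
        for run in runs:
            loop_len = run[0] - last_end
            if 1 <= loop_len <= max_loop:
                extend(path + [run])

    for run in runs:
        extend([run])
    return sorted(results)
```

For the human telomeric repeat `GGGTTAGGGTTAGGGTTAGGG`, `enumerate_quadruplexes` returns a single candidate made of the four G-runs at spans (0, 3), (6, 9), (12, 15), and (18, 21). The candidate sequence can be recovered by slicing from the start of the first run to the end of the last.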

G-Quadruplex; bioinformatics; artificial neural network; data mining; machine learning; pharmacogenetics

not recorded

not recorded

not recorded

not recorded

not recorded

not recorded

Publication details

65

12.06.2015.

defended

Details of the degree-granting institution

Sarajevo, Bosnia and Herzegovina

Related fields

Biology