Bibliographic record number: 773947

Nonlinear Sparse Component Analysis with a Reference: Variable Selection in Genomics and Proteomics


Kopriva, Ivica; Kapitanović, Sanja; Čačev, Tamara
Nonlinear Sparse Component Analysis with a Reference: Variable Selection in Genomics and Proteomics // 12th International Conference, LVA/ICA 2015, Liberec, Czech Republic, August 25-28, 2015, Proceedings / Vincent, Emanuel ; Yeredor, Ari ; Kolodovsky, Zbinyek ; Tichavsky, Petr (eds.).
Heidelberg: Springer, 2015. pp. 168-175 (invited lecture, international peer review, full paper (in extenso), scientific)


Title
Nonlinear Sparse Component Analysis with a Reference: Variable Selection in Genomics and Proteomics

Authors
Kopriva, Ivica ; Kapitanović, Sanja ; Čačev, Tamara

Type, subtype and category of work
Papers in conference proceedings, full paper (in extenso), scientific

Source
12th International Conference, LVA/ICA 2015, Liberec, Czech Republic, August 25-28, 2015, Proceedings / Vincent, Emanuel ; Yeredor, Ari ; Kolodovsky, Zbinyek ; Tichavsky, Petr - Heidelberg : Springer, 2015, 168-175

ISBN
978-3-319-22482-4

Conference
12th International Conference on Latent Variable Analysis and Signal Separation LVA/ICA 2015

Place and date
Liberec, Czech Republic, 25-28 August 2015

Type of participation
Invited lecture

Type of review
International peer review

Keywords
Variable selection; Nonlinear mixture model; Empirical kernel maps; Sparse component analysis

Abstract
Many scenarios in genomics and proteomics involve a small number of labeled samples and a large number of variables. To build prediction models that are robust to overfitting, variable selection is necessary. We propose a variable selection method based on nonlinear sparse component analysis with a reference that represents either the negative (healthy) or the positive (cancer) class. The component comprising cancer-related variables is thereby inferred automatically from the geometry of the nonlinear mixture model with a reference. The proposed method is compared with three supervised and two unsupervised variable selection methods on two-class problems using two genomic and two proteomic datasets. The results, which include an analysis of the biological relevance of the selected genes, are comparable with those achieved by the supervised methods. Thus, the proposed method may perform better on unseen data of the same cancer type.
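
The keywords of this record mention empirical kernel maps. As a rough illustration of that general concept only, and not a reproduction of the method in the paper, the sketch below maps each data point to its vector of Gaussian-kernel similarities with a chosen set of basis points. The function name `empirical_kernel_map`, the Gaussian kernel, the bandwidth `sigma`, the random data, and the choice of the first three rows as the basis are all assumptions made for illustration.

```python
import numpy as np


def empirical_kernel_map(X, basis, sigma=1.0):
    """Map each row of X to its Gaussian-kernel similarities with the basis rows.

    X has shape (n, d), basis has shape (m, d); the result has shape (n, m).
    The kernel choice and bandwidth are illustrative assumptions.
    """
    # Squared Euclidean distance between every row of X and every basis row.
    sq_dists = (
        np.sum(X**2, axis=1)[:, None]
        + np.sum(basis**2, axis=1)[None, :]
        - 2.0 * X @ basis.T
    )
    # Guard against tiny negative values caused by floating-point rounding.
    sq_dists = np.maximum(sq_dists, 0.0)
    return np.exp(-sq_dists / (2.0 * sigma**2))


# Hypothetical toy data: 6 points with 4 features each.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 4))
basis = X[:3]  # assumption: use the first three points as the basis

Phi = empirical_kernel_map(X, basis, sigma=2.0)
print(Phi.shape)
```

Each row of `Phi` is a new, nonlinear representation of the corresponding point; a point that is itself a basis element has similarity 1 with itself. How the paper combines such a map with sparse component analysis and a reference sample is described in the paper, not here.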

Original language
English

Scientific fields
Mathematics, Computer Science, Basic Medical Sciences



WORK ASSOCIATIONS