Bibliographic record number: 1089211
Topic modelling using Latent Dirichlet Allocation for bibliography of Slovenian researchers
Topic modelling using Latent Dirichlet Allocation for bibliography of Slovenian researchers // Book of Abstracts of the ISCCRO - International Statistical Conference in Croatia / Žmuk, Berislav ; Čeh Časni, Anita (eds.).
Zagreb: Hrvatsko statističko društvo, 2020. p. 17 (lecture, international peer review, abstract, scientific)
CROSBI ID: 1089211
Title
Topic modelling using Latent Dirichlet Allocation for bibliography of Slovenian researchers
Authors
Buhin Pandur, Maja ; Dobša, Jasminka ; Kronegger, Luka
Type, subtype and category of work
Conference abstracts, abstract, scientific
Source
Book of Abstracts of the ISCCRO - International Statistical Conference in Croatia
/ Žmuk, Berislav ; Čeh Časni, Anita - Zagreb : Hrvatsko statističko društvo, 2020, 17-17
Conference
3rd International Statistical Conference in Croatia (ISCCRO'20)
Place and date
Zagreb, Croatia, 15.10.2020 - 16.10.2020
Type of participation
Lecture
Type of review
International peer review
Keywords
interdisciplinarity, Latent Dirichlet Allocation, Slovenian research area
Abstract
Latent Dirichlet Allocation (LDA) is one of the most popular text mining techniques for topic modelling. With this method, we can obtain the main topics of a document collection by modelling each document as a probability distribution over topics, which indicates how likely the document is to express each topic. Topics are presented as sets of keywords. Our experiment will be conducted on a data set obtained from the COBISS (SICRIS) bibliographic system. It consists of the complete bibliographies of researchers with a Slovenian research ID who, according to SICRIS, were at any time affiliated with the Institute “Jožef Stefan” in Ljubljana. Topics obtained by LDA will be compared to the existing taxonomy of scientific fields for the Slovenian research area in terms of keywords, or the most frequently used words, for a set of categories. The final goal of the research is to identify potential interdisciplinary topics in the Slovenian research area. In future work, we also plan to use social network analysis to identify existing interdisciplinary research by analysing the co-authorship network.
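To illustrate the idea described in the abstract, the following is a minimal sketch of LDA fitted by collapsed Gibbs sampling in plain Python. It is not the authors' implementation: the function name `fit_lda`, the hyperparameter values, and the four toy "titles" are assumptions made up for illustration, and no data from COBISS or SICRIS is used. It returns, for each document, a probability distribution over topics and, for each topic, its most frequent words.

```python
import random

def fit_lda(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA on tokenised documents (illustrative sketch)."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)

    doc_topic = [[0] * n_topics for _ in docs]              # tokens in doc d assigned to topic k
    topic_word = [[0] * n_words for _ in range(n_topics)]   # occurrences of word v assigned to topic k
    topic_total = [0] * n_topics                            # all tokens assigned to topic k
    assignments = []

    # Give every token a random initial topic.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Unnormalised conditional probability of each topic for this token.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta) / (topic_total[t] + n_words * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    # Smoothed per-document topic distributions; each row sums to 1.
    theta = [
        [(c + alpha) / (len(doc) + n_topics * alpha) for c in counts]
        for doc, counts in zip(docs, doc_topic)
    ]
    # Topics presented as keyword sets: the three most frequent words per topic.
    top_words = [
        [vocab[v] for v in sorted(range(n_words), key=lambda v: -row[v])[:3]]
        for row in topic_word
    ]
    return theta, top_words

# Toy, made-up "titles" standing in for bibliography records.
docs = [
    "plasma fusion reactor physics".split(),
    "fusion plasma physics tokamak".split(),
    "neural network learning text".split(),
    "text mining learning network".split(),
]
theta, top_words = fit_lda(docs, n_topics=2)
for row in theta:
    print([round(p, 2) for p in row])
print(top_words)
```

On a real bibliography one would more likely use an established library implementation and preprocess the text (stop-word removal, lemmatisation). The keyword lists in `top_words` are the kind of output that could then be compared against the keywords of an existing taxonomy.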
Original language
English
Scientific fields
Information and communication sciences, Interdisciplinary social sciences
WORK AFFILIATIONS
Institutions:
Fakultet organizacije i informatike, Varaždin