Bibliographic record number: 960280
Building a corpus of the Croatian parliamentary debates using UDPipe open source NLP tools and Neo4j graph database for creation of social ontology model, text classification and extraction of semantic information
Building a corpus of the Croatian parliamentary debates using UDPipe open source NLP tools and Neo4j graph database for creation of social ontology model, text classification and extraction of semantic information // Proceedings of the Conference on Language Technologies & Digital Humanities 2018 / Fišer, D. ; Pančur, A. (eds.).
Ljubljana: Fakulteta za elektrotehniko, Univerza v Ljubljani, 2018. pp. 2016-220 (poster, international peer review, full paper (in extenso), scientific)
CROSBI ID: 960280
Title
Building a corpus of the Croatian parliamentary debates using UDPipe open source NLP tools and Neo4j graph database for creation of social ontology model, text classification and extraction of semantic information
Authors
Perak, Benedikt ; Rodik, Filip
Type, subtype and category of work
Papers in conference proceedings, full paper (in extenso), scientific
Source
Proceedings of the Conference on Language Technologies & Digital Humanities 2018
/ Fišer, D. ; Pančur, A. - Ljubljana : Fakulteta za elektrotehniko, Univerza v Ljubljani, 2018, 2016-220
ISBN
978-961-06-0111-1
Conference
Jezikovne tehnologije in digitalna humanistika 2018
Place and date
Ljubljana, Slovenia, 20.09.2018 - 21.09.2018
Type of participation
Poster
Type of peer review
International peer review
Keywords
corpus linguistics, graph database, universal dependencies, parliamentary debates
Abstract
This paper describes the process of creating a morphosyntactically tagged corpus of Croatian parliamentary debates. The NLP tool UDapi is used for tokenization, morphosyntactic parsing, and processing of Universal Dependencies data, covering over 300 thousand transcribed parliamentary speech utterances produced from 2003 to 2017. The resulting data are stored in a Neo4j graph database.
Original language
English
Scientific fields
Political science, Information and communication sciences, Philology
WORK AFFILIATIONS
Institutions:
Filozofski fakultet, Rijeka,
Sveučilište u Rijeci
Profiles:
Benedikt Perak
(author)