Building a corpus of the Croatian parliamentary debates using UDPipe open source NLP tools and Neo4j graph database for creation of social ontology model, text classification and extraction of semantic information

Perak, Benedikt; Rodik, Filip

Bibliographic record number: 960280

Building a corpus of the Croatian parliamentary debates using UDPipe open source NLP tools and Neo4j graph database for creation of social ontology model, text classification and extraction of semantic information // Proceedings of the Conference on Language Technologies & Digital Humanities 2018 / Fišer, D. ; Pančur, A. (eds.).
Ljubljana: Fakulteta za elektrotehniko, Univerza v Ljubljani, 2018, pp. 216-220 (poster, international peer review, full paper (in extenso), scientific)

CROSBI ID: 960280

Title
Building a corpus of the Croatian parliamentary debates using UDPipe open source NLP tools and Neo4j graph database for creation of social ontology model, text classification and extraction of semantic information

Authors
Perak, Benedikt ; Rodik, Filip

Type, subtype and category of work
Papers in conference proceedings, full paper (in extenso), scientific

Source
Proceedings of the Conference on Language Technologies & Digital Humanities 2018 / Fišer, D. ; Pančur, A. - Ljubljana : Fakulteta za elektrotehniko, Univerza v Ljubljani, 2018, pp. 216-220

ISBN
978-961-06-0111-1

Conference
Jezikovne tehnologije in digitalna humanistika 2018

Place and date
Ljubljana, Slovenia, 20-21 September 2018

Type of participation
Poster

Type of peer review
International peer review

Keywords
corpus linguistics, graph database, universal dependencies, parliamentary debates

Abstract
This paper describes the creation of a morphosyntactically tagged corpus of Croatian parliamentary debates. The NLP tool UDapi is used for tokenization, morphosyntactic parsing, and processing of Universal Dependencies data, covering over 300 thousand transcribed parliamentary speech utterances produced from 2003 to 2017. The resulting data are stored in a Neo4j graph database.
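
The pipeline outlined in the abstract (parse utterances into Universal Dependencies, then store them as a graph) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the sample sentence, the `Token` label, the `DEPENDS` relationship type, the property names, and the `utt1` identifier are all hypothetical. The sketch reads CoNLL-U text (the format UDPipe produces) and prepares parameter rows for two Cypher statements, without connecting to a database.

```python
from dataclasses import dataclass

# Hand-written CoNLL-U sample (hypothetical; not from the corpus).
CONLLU = """\
# sent_id = 1
# text = Parliament debates laws.
1\tParliament\tparliament\tNOUN\t_\t_\t2\tnsubj\t_\t_
2\tdebates\tdebate\tVERB\t_\t_\t0\troot\t_\t_
3\tlaws\tlaw\tNOUN\t_\t_\t2\tobj\t_\t_
4\t.\t.\tPUNCT\t_\t_\t2\tpunct\t_\t_
"""


@dataclass
class Token:
    idx: int
    form: str
    lemma: str
    upos: str
    head: int
    deprel: str


def parse_conllu(text):
    """Return the syntactic words of a CoNLL-U string as Token objects."""
    tokens = []
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if not cols[0].isdigit():
            continue  # skip multiword ranges (1-2) and empty nodes (1.1)
        tokens.append(
            Token(int(cols[0]), cols[1], cols[2], cols[3], int(cols[6]), cols[7])
        )
    return tokens


def to_graph(tokens, utterance_id):
    """Turn tokens into node rows and dependency-edge rows (assumed schema)."""
    nodes = [
        {"id": f"{utterance_id}:{t.idx}", "form": t.form,
         "lemma": t.lemma, "upos": t.upos}
        for t in tokens
    ]
    edges = [
        {"head": f"{utterance_id}:{t.head}", "dep": f"{utterance_id}:{t.idx}",
         "deprel": t.deprel}
        for t in tokens
        if t.head != 0  # the root has no incoming dependency edge
    ]
    return nodes, edges


# Cypher statements for the assumed schema; each takes a $rows list parameter.
NODE_QUERY = (
    "UNWIND $rows AS r MERGE (t:Token {id: r.id}) "
    "SET t.form = r.form, t.lemma = r.lemma, t.upos = r.upos"
)
EDGE_QUERY = (
    "UNWIND $rows AS r MATCH (h:Token {id: r.head}), (d:Token {id: r.dep}) "
    "MERGE (h)-[:DEPENDS {deprel: r.deprel}]->(d)"
)

tokens = parse_conllu(CONLLU)
nodes, edges = to_graph(tokens, "utt1")
print(len(nodes), len(edges))
```

With the official `neo4j` Python driver, rows prepared this way could be sent with `session.run(NODE_QUERY, rows=nodes)` followed by `session.run(EDGE_QUERY, rows=edges)`. Batching rows through `UNWIND` is one common way to keep the number of round trips low when loading a large corpus.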

Original language
English

Scientific fields
Political science, Information and communication sciences, Philology

ASSOCIATED WITH THE WORK

Projects:
UNIRI Inicijalna potpora 2017 - 1016
UIP-05-2017-9219

Institutions:
Faculty of Humanities and Social Sciences, Rijeka
University of Rijeka

Profiles:

Benedikt Perak (author)

Links to the full text of the paper:

www.sdjt.si
