Bibliographic record number: 507924

Building a gold standard for event detection in Croatian


Ljubešić, Nikola; Boras, Damir; Lauc, Tomislava
Building a gold standard for event detection in Croatian // Language Resources and Evaluation Conference / Calzolari, Nicoletta ; Choukri, Khalid ; Maegaard, Bente ; Mariani, Joseph ; Odijk, Jan ; Piperidis, Stelios ; Rosner, Mike ; Tapias, Daniel (eds.).
Valletta: European Language Resources Association (ELRA), 2010. pp. 3101-3104 (poster, international peer review, full paper (in extenso), scientific)


CROSBI ID: 507924

Title
Building a gold standard for event detection in Croatian

Authors
Ljubešić, Nikola ; Boras, Damir ; Lauc, Tomislava

Type, subtype and category of work
Papers in conference proceedings, full paper (in extenso), scientific

ISBN
2-9517408-6-7

Conference
Language Resources and Evaluation Conference

Place and date
Valletta, Malta, 17 May 2010 - 23 May 2010

Type of participation
Poster

Type of review
International peer review

Keywords
event detection; gold standard; newspaper text; Croatian language

Abstract
This paper describes the process of building a newspaper corpus annotated with events described in specific documents. The main difference to the corpora built as part of the TDT initiative is that documents are not annotated by topics, but by specific events they describe. Additionally, documents are gathered from sixteen sources and all documents in the corpus are annotated with the corresponding event. The annotation process consists of a browsing and a searching step. Experiments are performed with a threshold that could be used in the browsing step yielding the result of having to browse through only 1% of document pairs for a 2% loss of relevant document pairs. A statistical analysis of the annotated corpus is undertaken showing that most events are described by few documents while just some events are reported by many documents. The inter-annotator agreement measures show high agreement concerning grouping documents into event clusters, but show a much lower agreement concerning the number of events the documents are organized into. An initial experiment is described giving a baseline for further research on this corpus.

Original language
English

Scientific fields
Information and communication sciences



RELATED ENTITIES


Projects:
130-1301679-1380 - Hrvatska rječnička baština i hrvatski europski identitet (Boras, Damir, MZOS)
130-1301799-1999 - Oblikovanje i upravljanje javnim znanjem u informacijskom prostoru (Tuđman, Miroslav, MZOS)

Institutions:
Filozofski fakultet, Zagreb

Profiles:

Nikola Ljubešić (author)

Tomislava Lauc (author)

Damir Boras (author)

Cite this publication:

Ljubešić, Nikola; Boras, Damir; Lauc, Tomislava
Building a gold standard for event detection in Croatian // Language Resources and Evaluation Conference / Calzolari, Nicoletta ; Choukri, Khalid ; Maegaard, Bente ; Mariani, Joseph ; Odijk, Jan ; Piperidis, Stelios ; Rosner, Mike ; Tapias, Daniel (eds.).
Valletta: European Language Resources Association (ELRA), 2010. pp. 3101-3104 (poster, international peer review, full paper (in extenso), scientific)
Ljubešić, N., Boras, D. & Lauc, T. (2010) Building a gold standard for event detection in Croatian. In: Calzolari, N., Choukri, K., Maegaard, B., Mariani, J., Odijk, J., Piperidis, S., Rosner, M. & Tapias, D. (eds.) Language Resources and Evaluation Conference.
@article{article, author = {Ljube\v{s}i\'{c}, Nikola and Boras, Damir and Lauc, Tomislava}, year = {2010}, pages = {3101-3104}, keywords = {event detection, gold standard, newspaper text, Croatian language}, isbn = {2-9517408-6-7}, title = {Building a gold standard for event detection in Croatian}, keyword = {event detection, gold standard, newspaper text, Croatian language}, publisher = {European Language Resources Association (ELRA)}, publisherplace = {Valletta, Malta} }