Bibliographic record number: 455001

Improving Chunking Accuracy on Croatian Texts by Morphosyntactic Tagging


Vučković, Kristina; Agić, Željko; Tadić, Marko
Improving Chunking Accuracy on Croatian Texts by Morphosyntactic Tagging // Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC 2010) / Calzolari, Nicoletta ; Choukri, Khalid ; Maegaard, Bente ; Mariani, Joseph ; Odijk, Jan ; Piperidis, Stelios ; Rosner, Mike ; Tapias, Daniel (eds.).
Valletta: European Language Resources Association (ELRA), 2010. pp. 1944-1949 (poster, international peer review, full paper (in extenso), scientific)


CROSBI ID: 455001

Title
Improving Chunking Accuracy on Croatian Texts by Morphosyntactic Tagging

Authors
Vučković, Kristina ; Agić, Željko ; Tadić, Marko

Type, subtype and category of work
Papers in conference proceedings, full paper (in extenso), scientific

Source
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC 2010) / Calzolari, Nicoletta ; Choukri, Khalid ; Maegaard, Bente ; Mariani, Joseph ; Odijk, Jan ; Piperidis, Stelios ; Rosner, Mike ; Tapias, Daniel - Valletta : European Language Resources Association (ELRA), 2010, 1944-1949

ISBN
2-9517408-6-7

Conference
Seventh International Conference on Language Resources and Evaluation (LREC 2010)

Place and date
Valletta, Malta, 17-23 May 2010

Type of participation
Poster

Type of review
International peer review

Keywords
chunking; partial parsing; morphosyntactic tagging

Abstract
In this paper, we present the results of an experiment in which a stochastic morphosyntactic tagger is used as a pre-processing module for a rule-based chunker and partial parser for Croatian, with the aim of raising its overall chunking and partial parsing accuracy on Croatian texts. To conduct the experiment, we manually chunked and partially parsed 459 sentences from the Croatia Weekly 100 kw newspaper sub-corpus of the Croatian National Corpus; these sentences had previously been morphosyntactically disambiguated and lemmatized. Due to the lack of resources of this type, these sentences were designated as a temporary chunking and partial parsing gold standard for Croatian. We then evaluated the chunker and partial parser in three scenarios: (1) chunking text with no prior morphosyntactic tagging, (2) chunking text tagged by the stochastic morphosyntactic tagger for Croatian, and (3) chunking manually tagged text. The F1-scores obtained for the three scenarios were, respectively, 0.875 (P: 0.826, R: 0.930), 0.900 (P: 0.866, R: 0.937) and 0.930 (P: 0.912, R: 0.949). The paper describes the language resources and tools used in the experiment, its setup, and a discussion of the results and perspectives for future work.
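
The three F1-scores in the abstract can be cross-checked against the precision and recall values given next to them. The sketch below assumes the balanced F-measure, F1 = 2PR / (P + R), which the abstract does not spell out. The names `f1_score`, `SCENARIOS` and `check_scenarios` are illustrative and do not come from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Balanced F-measure: the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the abstract.
# The scenario labels are shorthand chosen for this sketch.
SCENARIOS = {
    "untagged": (0.826, 0.930, 0.875),
    "stochastically tagged": (0.866, 0.937, 0.900),
    "manually tagged": (0.912, 0.949, 0.930),
}


def check_scenarios(tolerance: float = 0.0005) -> dict:
    """Return, per scenario, whether the recomputed F1 matches the
    reported value to within rounding at three decimal places."""
    results = {}
    for name, (p, r, reported) in SCENARIOS.items():
        results[name] = abs(f1_score(p, r) - reported) <= tolerance
    return results


if __name__ == "__main__":
    for name, ok in check_scenarios().items():
        print(f"{name}: {'consistent' if ok else 'inconsistent'}")
```

Under this assumption, all three reported F1 values agree with their precision and recall figures once rounded to three decimals.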

Original language
English

Scientific fields
Information and Communication Sciences, Philology



RELATED PROJECTS, INSTITUTIONS AND PROFILES


Projects:
130-1300646-0645 - Hrvatski jezični resursi i njihovo obilježavanje [Croatian Language Resources and Their Annotation] (Tadić, Marko, MZOS) (CroRIS)
130-1300646-1776 - Računalna sintaksa hrvatskoga jezika [Computational Syntax of the Croatian Language] (Dovedan Han, Zdravko, MZOS) (CroRIS)

Institutions:
Faculty of Humanities and Social Sciences (Filozofski fakultet), Zagreb

Profiles:

Marko Tadić (author)

Željko Agić (author)

Kristina Kocijan (author)

Links to the full text of the paper:

Full text access: www.lrec-conf.org

Cite this publication:

Vučković, Kristina; Agić, Željko; Tadić, Marko
Improving Chunking Accuracy on Croatian Texts by Morphosyntactic Tagging // Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC 2010) / Calzolari, Nicoletta ; Choukri, Khalid ; Maegaard, Bente ; Mariani, Joseph ; Odijk, Jan ; Piperidis, Stelios ; Rosner, Mike ; Tapias, Daniel (eds.).
Valletta: European Language Resources Association (ELRA), 2010. pp. 1944-1949 (poster, international peer review, full paper (in extenso), scientific)
Vučković, K., Agić, Ž. & Tadić, M. (2010) Improving Chunking Accuracy on Croatian Texts by Morphosyntactic Tagging. In: Calzolari, N., Choukri, K., Maegaard, B., Mariani, J., Odijk, J., Piperidis, S., Rosner, M. & Tapias, D. (eds.) Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC 2010).
@inproceedings{article,
  author = {Vu\v{c}kovi\'{c}, Kristina and Agi\'{c}, \v{Z}eljko and Tadi\'{c}, Marko},
  title = {Improving Chunking Accuracy on Croatian Texts by Morphosyntactic Tagging},
  booktitle = {Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC 2010)},
  year = {2010},
  pages = {1944--1949},
  isbn = {2-9517408-6-7},
  keywords = {chunking, partial parsing, morphosyntactic tagging},
  publisher = {European Language Resources Association (ELRA)},
  address = {Valletta, Malta}
}



