Bibliographic record no.: 375351

Improving Part-of-Speech Tagging Accuracy for Croatian by Morphological Analysis


Agić, Željko; Tadić, Marko; Dovedan, Zdravko
Improving Part-of-Speech Tagging Accuracy for Croatian by Morphological Analysis // Informatica, 32 (2008), 4; 445-451 (international peer review, article, scientific)


CROSBI ID: 375351

Title
Improving Part-of-Speech Tagging Accuracy for Croatian by Morphological Analysis

Authors
Agić, Željko ; Tadić, Marko ; Dovedan, Zdravko

Source
Informatica (0350-5596) 32 (2008), 4; 445-451

Type, subtype and category of work
Journal papers, article, scientific

Keywords
part-of-speech tagging ; morphological analysis ; inflectional lexicon ; Croatian language

Abstract
This paper investigates several methods of combining a second-order hidden Markov model part-of-speech (morphosyntactic) tagger with a high-coverage inflectional lexicon for Croatian. Our primary motivation was to improve the tagging accuracy of Croatian texts using our newly developed tagger CroTag, currently in beta version. We also wanted to compare its tagging results, both standalone and with the morphological lexicon, to those previously reported in (Agić and Tadić 2006) for the TnT statistical tagger, which we used as a reference point because both taggers implement the same tagging procedure. We first explain the basic idea behind the experiment, its motivation, and its importance for processing the Croatian language. We then describe the tools (the tagger and the lexicon) and the language resources used in the experiment, including their implementation and the relevant input/output format details. With the basics presented, we describe in theory four possible methods of combining these resources and tools with respect to their operating paradigm, input and production capabilities, and then put these ideas to the test using the F-measure evaluation framework. Finally, we discuss the results in detail and present conclusions and plans for future work.

Original language
English

Scientific fields
Computing, Information and Communication Sciences, Philology



RELATED TO THIS WORK


Projects:
036-1300646-1986 - Knowledge Discovery in Textual Data (Dalbelo-Bašić, Bojana, MZOS) (POIROT)
130-1300646-0645 - Croatian Language Resources and Their Annotation (Tadić, Marko, MZOS) (POIROT)
130-1300646-1776 - Computational Syntax of Croatian (Dovedan Han, Zdravko, MZOS) (POIROT)

Institutions:
Faculty of Humanities and Social Sciences, Zagreb

Profiles:

Zdravko Dovedan Han (author)

Marko Tadić (author)

Željko Agić (author)

Links to the full text:

Full-text access: www.informatica.si

Cite this publication:

Agić, Ž., Tadić, M. & Dovedan, Z. (2008) Improving Part-of-Speech Tagging Accuracy for Croatian by Morphological Analysis. Informatica, 32 (4), 445-451.
@article{article, author = {Agić, Željko and Tadić, Marko and Dovedan, Zdravko}, year = {2008}, pages = {445-451}, keywords = {part-of-speech tagging, morphological analysis, inflectional lexicon, Croatian language}, journal = {Informatica}, volume = {32}, number = {4}, issn = {0350-5596}, title = {Improving Part-of-Speech Tagging Accuracy for Croatian by Morphological Analysis} }

The journal is indexed in:


  • Web of Science Core Collection (WoSCC)
    • SCI-EXP, SSCI and/or A&HCI
    • Emerging Sources Citation Index (ESCI)
  • Scopus


Inclusion in other bibliographic databases:


  • ABI/INFORM
  • Citeseer
  • COBISS
  • Compendex
  • Computer & Information Systems Abstracts
  • Computer Database
  • Computer Science Index
  • DBLP Computer Science Bibliography
  • Directory of Open Access Journals
  • InfoTrac OneFile
  • Inspec
  • Linguistic and Language Behaviour Abstracts
  • Mathematical Reviews, MatSciNet, MatSci on SilverPlatter and Current Mathematical Publications
  • Scopus
  • Zentralblatt Math




