Disambiguation of Homograms in a Pitch Accent Language

Nacinovic Prskalo, Lucia; Brkic Bakaric, Marija

Data source: CROSBI

Disambiguation of Homograms in a Pitch Accent Language (CROSBI ID 657741)

Conference paper in proceedings | original scientific paper | international peer review

Nacinovic Prskalo, Lucia ; Brkic Bakaric, Marija. Disambiguation of Homograms in a Pitch Accent Language // Proceedings of 2017 International Conference on Computer Science and Artificial Intelligence CSAI 2017. New York (NY): The Association for Computing Machinery (ACM), 2017. pp. 32-37. doi: 10.1145/3168390.3168409

Responsibility details

Authors

Nacinovic Prskalo, Lucia ; Brkic Bakaric, Marija

Basic data in the original language

Language

English

Title

Disambiguation of Homograms in a Pitch Accent Language

Abstract

Croatian is a pitch-accent language in which the tone contour realized on the stressed syllable carries lexical information. Consequently, a different lexical accent can give a word a different meaning. In written texts, where accents are usually not marked, such ambiguity can be resolved by determining the appropriate accent. There are also cases in which various basic and derived forms of words have different meanings, different morphosyntactic tags, and possibly different accents. Words that have the same written form but different meanings are called homograms. To resolve the ambiguity of homograms, we created a lexicon of homograms comprising all Croatian nouns of different gender that have the same written form (when accents are not marked) but different meanings, morphosyntactic tags, and possibly different accents. The lexicon consists of 19,366 entries and 3,460 unique homograms. Each entry comprises the homogram (unaccented word), the corresponding morphosyntactic description (MSD), the accented word, and the accented lemma. The lexicon enables us to identify and disambiguate homograms within the corpus efficiently and accurately.
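The entry structure described in the abstract (homogram, MSD, accented word, accented lemma) could be represented roughly as follows. This is a minimal sketch, not the authors' implementation: the names `LexiconEntry`, `build_index`, `candidates`, and `disambiguate` are hypothetical, the sample entries are made-up placeholders rather than real lexicon data, and the MSD strings are only assumed to look like MULTEXT-East-style tags. It also assumes that an MSD for the token is already available, for example from a tagger.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class LexiconEntry:
    """One lexicon row, using the four fields named in the abstract."""
    homogram: str        # unaccented written form
    msd: str             # morphosyntactic description tag
    accented_word: str   # word form with the accent marked
    accented_lemma: str  # lemma with the accent marked


def build_index(entries):
    """Group entries by their lowercased unaccented form for fast lookup."""
    index = defaultdict(list)
    for entry in entries:
        index[entry.homogram.lower()].append(entry)
    return index


def candidates(index, token):
    """Return every entry whose unaccented form matches the token."""
    return index.get(token.lower(), [])


def disambiguate(index, token, msd):
    """Keep only the candidates whose MSD matches the supplied tag."""
    return [e for e in candidates(index, token) if e.msd == msd]


# Placeholder entries (NOT real Croatian data): "abc[1]" and "abc[2]"
# stand in for two differently accented readings of the same written form.
sample_entries = [
    LexiconEntry("abc", "Ncmsn", "abc[1]", "abc[1]"),
    LexiconEntry("abc", "Ncfpg", "abc[2]", "abca[2]"),
    LexiconEntry("def", "Ncmsn", "def[1]", "def[1]"),
]

index = build_index(sample_entries)
readings = disambiguate(index, "abc", "Ncfpg")
```

A token is a homogram candidate when `candidates` returns more than one entry. Filtering by MSD narrows the candidates only when the readings differ in their tags; readings that share a tag would need further context to separate.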

Keywords

disambiguation of homograms ; lexicon of homograms ; word sense disambiguation ; pitch accent language ; homogram ; homograph ; homophone ; Croatian language

Note

not recorded

Contribution details

Pages

32-37

Year of publication

2017

Publication status

published

DOI

10.1145/3168390.3168409

Host publication details

Title

Proceedings of 2017 International Conference on Computer Science and Artificial Intelligence CSAI 2017

Publisher

New York (NY): The Association for Computing Machinery (ACM)

ISBN

978-1-4503-5392-2

Conference details

Conference

International Conference on Computer Science and Artificial Intelligence (CSAI 2017)

Type of participation

lecture

Conference dates

5-7 December 2017

Conference location

Jakarta, Indonesia

Related records

Related persons

Lucia Načinović Prskalo (author)

Marija Brkić Bakarić (author)

Related institutions

Sveučilište u Rijeci, Fakultet informatike i digitalnih tehnologija (318) (author's institution)

Field

Information and communication sciences

Links

doi.org

dl.acm.org

Indexing

Scopus