Bibliographic record no. 835299
Postupci i statistike slogovanja za talijanski jezik
Postupci i statistike slogovanja za talijanski jezik, 2016, diploma thesis, undergraduate, Department of Informatics, Rijeka
CROSBI ID: 835299
Title
Postupci i statistike slogovanja za talijanski jezik
(Syllabification and statistics of the Italian language)
Authors
Pokos, Marija
Type, subtype and category of work
Graded theses, diploma thesis, undergraduate
Faculty
Department of Informatics
Place
Rijeka
Date
08.09
Year
2016
Pages
51
Mentor
Martinčić-Ipšić, Sanda
Keywords
slog; slogovanje; ortografski zapis; fonetski zapis; pogreška; fonem
(syllable; syllabification; orthographic notation; phonetic notation; error; phoneme)
Abstract
Syllabification is the process of dividing a word into syllables. The main objective of this research is to build a parser and implement rules for syllabifying Italian words in an existing Italian dictionary. The dictionary contains 440084 words, some native Italian and others adopted from other languages, and is organized in three columns. The dictionary first had to be reorganized into two folders, one containing each word with its part of speech and the other containing each word together with the same word divided into syllables, written with the phonetic signs specified in the initial file. The next step was to reorganize the third column, which looks like this: (((a1) 1) ((b a) 0) ((k i) 0))). The first parsing pass removed the brackets and zeros and added a hyphen between syllables; the ones were kept because they mark the stress. This parsing produced strings such as a11-ba-ki. The syllabification rules then had to be implemented so that they divide words into syllables in orthographic notation. This leaves two different notations of the words, one orthographic and one phonetic, so the initial phonetic notation has to be converted to orthographic notation before the initial syllabification can be compared with the one produced by the rules. This backtracking to orthographic notation has to be done for each letter separately, because the same phonetic sign can correspond to different letters. For example, the phonetic sign /dZi/ becomes orthographic [gi] for the letter g and [ji] for the letter j. The phonetic notation used in this dictionary is not entirely original; it has been slightly modified to make it easier to understand.
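The first parsing step described above can be illustrated with a minimal sketch. It assumes every syllable in the third column has the form ((phones) stress) with stress equal to 0 or 1; the function name syllables_to_hyphenated is hypothetical and not taken from the thesis.

```python
import re

# Assumed shape of one syllable: "((p1 p2 ...) s)" where s is 0 or 1.
SYLLABLE = re.compile(r"\(\(([^()]*)\)\s*([01])\)")

def syllables_to_hyphenated(entry: str) -> str:
    """Drop brackets and zeros, keep the stress mark 1, join syllables with hyphens."""
    parts = []
    for phones, stress in SYLLABLE.findall(entry):
        syllable = "".join(phones.split())  # remove spaces between phone symbols
        if stress == "1":
            syllable += "1"  # the one is kept because it marks the stress
        parts.append(syllable)
    return "-".join(parts)

print(syllables_to_hyphenated("(((a1) 1) ((b a) 0) ((k i) 0)))"))  # a11-ba-ki
```

Matching each syllable with a regular expression, rather than balancing brackets, keeps the sketch tolerant of stray outer brackets around the entry.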
Because the notations differ, errors arise when backtracking from phonetic to orthographic notation, and further errors occur when the rules divide the words. The analysis shows the difference between the phonetic and orthographic notations and gives the following error percentages: 77.29% errors for the phonetic and orthographic notation; 42.12% errors for the orthographic notation; after parsing, 35.17% of the incorrect notations were corrected.
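An error percentage of this kind could be computed as in the following sketch, which assumes that an error means a rule-based syllabification differing from the dictionary's reference syllabification for the same word. The function error_rate and the sample words are illustrative assumptions, not data or code from the thesis.

```python
def error_rate(predicted: list[str], reference: list[str]) -> float:
    """Percentage of words whose predicted syllabification differs from the reference."""
    if len(predicted) != len(reference):
        raise ValueError("both lists must cover the same words")
    if not reference:
        raise ValueError("at least one word is required")
    mismatches = sum(p != r for p, r in zip(predicted, reference))
    return 100.0 * mismatches / len(reference)

# Made-up sample: one of four words is divided differently from the reference.
predicted = ["a-ba-ki", "ca-sa", "ga-tto", "li-bro"]
reference = ["a-ba-ki", "ca-sa", "gat-to", "li-bro"]
print(error_rate(predicted, reference))  # 25.0
```

The reported figures are mutually consistent: 77.29 minus 42.12 equals 35.17, the share given as corrected after parsing.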
Original language
Croatian
Scientific fields
Information and communication sciences
RELATED TO THIS WORK
Institutions:
Faculty of Informatics and Digital Technologies, Rijeka
Profiles:
Sanda Martinčić-Ipšić
(mentor)