Bibliographic record no. 607634

Normalization of Non-Standard Words in Croatian Texts


Beliga, Slobodan; Pobar, Miran; Martinčić-Ipšić, Sanda
Normalization of Non-Standard Words in Croatian Texts // Text, Speech and Dialogue extension to Lecture Notes in Artificial Intelligence LNAI6836 / Hebernal, Ivan ; Matoušek, Vaclav (eds.).
Plzeň: University of West Bohemia, 2011, pp. 1-8 (lecture, international peer review, full paper (in extenso), scientific)


CROSBI ID: 607634

Title
Normalization of Non-Standard Words in Croatian Texts

Authors
Beliga, Slobodan ; Pobar, Miran ; Martinčić-Ipšić, Sanda

Type, subtype and category of work
Papers in conference proceedings, full paper (in extenso), scientific

Source
Text, Speech and Dialogue extension to Lecture Notes in Artificial Intelligence LNAI6836 / Hebernal, Ivan ; Matoušek, Vaclav - Plzeň : University of West Bohemia, 2011, 1-8

ISBN
987-80-261-0069-0

Conference
Text, Speech and Dialogue

Place and date
Plzeň, Czech Republic, 01.09.2011 - 05.09.2011

Type of participation
Lecture

Type of review
International peer review

Keywords
text normalization; non-standard words; text-to-speech

Abstract
This paper presents text normalization, an integral part of any text-to-speech synthesis system. Text normalization is a set of methods for rewriting non-standard words, such as numbers, dates, times, abbreviations, acronyms and the most common symbols, in their fully expanded form. A complete taxonomy for classifying non-standard words in the Croatian language is proposed, together with rule-based normalization methods combined with a lookup dictionary. The achieved token rate for normalization of Croatian texts is 95%, and 80% of the expanded words are in the correct morphological form.
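
To illustrate the general idea described in the abstract (classify each token, then expand it by a rule or a dictionary lookup), here is a minimal Python sketch. It is not the authors' implementation: the three-class taxonomy, the tiny abbreviation dictionary, the digit-by-digit reading rule, and the names `ABBREVIATIONS`, `DIGITS`, `classify`, `expand` and `normalize` are all assumptions made for this example. It also ignores morphological agreement, which the abstract identifies as the harder part of the task.

```python
import re

# Hypothetical lookup dictionary with a few common Croatian abbreviations.
ABBREVIATIONS = {
    "dr.": "doktor",
    "npr.": "na primjer",
    "itd.": "i tako dalje",
}

# Croatian cardinal numerals for single digits (nominative, masculine).
DIGITS = ["nula", "jedan", "dva", "tri", "četiri",
          "pet", "šest", "sedam", "osam", "devet"]


def classify(token):
    """Assign a token to one of three toy classes: ABBR, NUM or WORD."""
    if token.lower() in ABBREVIATIONS:
        return "ABBR"
    if re.fullmatch(r"[0-9]+", token):
        return "NUM"
    return "WORD"


def expand(token):
    """Expand a non-standard token; standard words pass through unchanged."""
    kind = classify(token)
    if kind == "ABBR":
        # Dictionary lookup for known abbreviations.
        return ABBREVIATIONS[token.lower()]
    if kind == "NUM":
        # Simplistic rule: read the number digit by digit.
        return " ".join(DIGITS[int(d)] for d in token)
    return token


def normalize(text):
    """Split on whitespace, expand each token, and rejoin with spaces."""
    return " ".join(expand(t) for t in text.split())


print(normalize("dr. Horvat ima 2 psa"))
```

A fuller system would need many more classes (dates, times, symbols), a proper tokenizer that separates punctuation from tokens, and inflection of the expanded words to match their context.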

Original language
English

Scientific fields
Computing, Information and Communication Sciences

Note
Student Section



RELATED TO THIS WORK


Projects:
318-0361935-0852 - Govorne tehnologije (Ipšić, Ivo, MZOS) (CroRIS)

Institutions:
Fakultet informatike i digitalnih tehnologija, Rijeka


Cite this publication:

Beliga, S., Pobar, M. & Martinčić-Ipšić, S. (2011) Normalization of Non-Standard Words in Croatian Texts. In: Hebernal, I. & Matoušek, V. (eds.) Text, Speech and Dialogue extension to Lecture Notes in Artificial Intelligence LNAI6836.
@article{article,
  author = {Beliga, Slobodan and Pobar, Miran and Martin\v{c}i\'{c}-Ip\v{s}i\'{c}, Sanda},
  year = {2011},
  pages = {1-8},
  keywords = {text normalization, non-standard words, text-to-speech},
  isbn = {987-80-261-0069-0},
  title = {Normalization of Non-Standard Words in Croatian Texts},
  keyword = {text normalization, non-standard words, text-to-speech},
  publisher = {University of West Bohemia},
  publisherplace = {Plze\v{n}, \v{C}e\v{s}ka Republika}
}