Bibliographic record number: 849948
Recognizing verb-based Croatian idiomatic MWUs
Recognizing verb-based Croatian idiomatic MWUs // Automatic processing of natural-language electronic texts with NooJ : revised selected papers / Okrut, Tatsiana ; Hetsevich, Yuras ; Silberztein, Max ; Stanislavenka, Hanna (eds.).
Minsk, Belarus: Springer, 2016. pp. 96-106 (lecture, peer-reviewed, full paper (in extenso), scientific)
CROSBI ID: 849948
Title
Recognizing verb-based Croatian idiomatic MWUs
Authors
Kocijan, Kristina ; Librenjak, Sara
Type, subtype and category of work
Conference proceedings papers, full paper (in extenso), scientific
Source
Automatic processing of natural-language electronic texts with NooJ : revised selected papers
/ Okrut, Tatsiana ; Hetsevich, Yuras ; Silberztein, Max ; Stanislavenka, Hanna - : Springer, 2016, 96-106
ISBN
978-3-319-42470-5
Conference
International conference on automatic processing of natural-language electronic texts with NooJ - NooJ2015
Place and date
Minsk, Belarus, 11.06.2015. - 13.06.2015.
Type of participation
Lecture
Type of review
Peer-reviewed
Keywords
Croatian, idioms, verbal phrases, NooJ, MWU, MWE, frozen expressions, semi-frozen expressions
Abstract
This paper tackles the computational problems posed by Croatian verbal idioms. The Croatian language has a very rich phraseme structure, as described in Matešić (1982), Menac (2007) and Menac-Mihalić (2007), among many others. This work is one of the few attempts at computational analysis of idioms in Croatian as multi-word units. We used a rule-based approach and NooJ syntactic grammars to recognize any verb-based idiom (of the ~1500 analyzed) in any syntactic position. The Croatian Dictionary of Idioms (Menac et al. 2003) was used for the initial list, which was supplemented with new additions during the training phase. The grammars were tested on corpora constructed specifically for this work, which were used to calculate recall, precision and f-measure for our grammars. With final results of recall < 98 %, precision < 96 % and f-measure < 97 %, we consider this a successful attempt at recognizing verb-based idioms in Croatian.
Original language
English
Scientific fields
Information and Communication Sciences, Philology
Indexed in:
- Scopus