
Bibliographic record no. 598891

Distributional Semantics Approach to Detecting Synonyms in Croatian Language


Karan, Mladen; Šnajder, Jan; Dalbelo Bašić, Bojana
Distributional Semantics Approach to Detecting Synonyms in Croatian Language // Proceedings of the Eighth Language Technologies Conference / Erjavec, Tomaž ; Žganec Gros, Jerneja (eds.).
Ljubljana, 2012, pp. 111-116 (lecture, international peer review, full paper (in extenso), scientific)


CROSBI ID: 598891

Title
Distributional Semantics Approach to Detecting Synonyms in Croatian Language

Authors
Karan, Mladen ; Šnajder, Jan ; Dalbelo Bašić, Bojana

Type, subtype, and category of work
Papers in conference proceedings, full paper (in extenso), scientific

Source
Proceedings of the Eighth Language Technologies Conference / Erjavec, Tomaž ; Žganec Gros, Jerneja - Ljubljana, 2012, 111-116

Conference
Information Society 2012 - Eighth Language Technologies Conference

Place and date
Ljubljana, Slovenia, 08.10.2012 - 09.10.2012

Type of participation
Lecture

Type of review
International peer review

Keywords
Named Entities ; Extraction ; Classification

Abstract
Identifying synonyms is important for many natural language processing and information retrieval applications. In this paper we address the task of automatically identifying synonyms in Croatian using distributional semantic models (DSMs). We build several DSMs using latent semantic analysis (LSA) and random indexing (RI) on the large hrWaC corpus. We evaluate the models on a dictionary-based similarity test: a set of synonymy questions generated automatically from a machine-readable dictionary. Results indicate that LSA models outperform RI models on this task, with accuracies of 68.7%, 68.2%, and 61.6% on nouns, adjectives, and verbs, respectively. We analyze how word frequency and polysemy level affect performance and discuss common causes of synonym misidentification.
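
To make the evaluation idea in the abstract concrete, the following is a minimal sketch of random indexing (RI) used to answer one synonymy question. It is not the authors' implementation: the toy English corpus, the sentence-wide context window, the dimensionality `DIM`, the sparsity `NONZERO`, and the function names `build_ri_model` and `answer_question` are all assumptions made for illustration. The paper itself works on the hrWaC corpus with questions drawn from a machine-readable dictionary.

```python
import math
import random
from collections import defaultdict

DIM = 64      # dimensionality of the reduced space (assumed toy value)
NONZERO = 4   # number of +1/-1 entries per index vector (assumed)

# Hypothetical toy corpus; each sentence is a list of tokens.
SENTENCES = [
    "the car drives on the road".split(),
    "the automobile drives on the road".split(),
    "the banana grows on the tree".split(),
    "the apple grows on the tree".split(),
]


def index_vector(rng):
    """Sparse ternary random vector: a few +1/-1 entries, zeros elsewhere."""
    vec = [0] * DIM
    for pos in rng.sample(range(DIM), NONZERO):
        vec[pos] = rng.choice((-1, 1))
    return vec


def build_ri_model(sentences, seed=0):
    """Sum the index vectors of co-occurring words into each word's context vector."""
    rng = random.Random(seed)
    index = {}
    for sent in sentences:
        for word in sent:
            if word not in index:
                index[word] = index_vector(rng)
    context = defaultdict(lambda: [0] * DIM)
    for sent in sentences:
        for i, target in enumerate(sent):
            for j, neighbour in enumerate(sent):
                if i != j:  # assumption: the whole sentence is the context window
                    for k, value in enumerate(index[neighbour]):
                        context[target][k] += value
    return dict(context)


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def answer_question(model, target, candidates):
    """Pick the candidate whose context vector is closest to the target's."""
    return max(candidates, key=lambda c: cosine(model[target], model[c]))


model = build_ri_model(SENTENCES)
print(answer_question(model, "car", ["banana", "automobile", "tree"]))
```

Accuracy on a test set would then be the fraction of questions for which the selected candidate is the dictionary synonym. An LSA variant would replace the random projection with a truncated singular value decomposition of the co-occurrence matrix and keep the same cosine-based selection step.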

Original language
English

Scientific fields
Computer science



RELATED TO THIS WORK


Projects:
036-1300646-1986 - Otkrivanje znanja u tekstnim podacima (Knowledge Discovery in Textual Data) (Dalbelo-Bašić, Bojana, MZO) (CroRIS)

Institutions:
Faculty of Electrical Engineering and Computing, Zagreb

Profiles:

Jan Šnajder (author)

Bojana Dalbelo Bašić (author)

Mladen Karan (author)

Cite this publication:

Karan, Mladen; Šnajder, Jan; Dalbelo Bašić, Bojana
Distributional Semantics Approach to Detecting Synonyms in Croatian Language // Proceedings of the Eighth Language Technologies Conference / Erjavec, Tomaž ; Žganec Gros, Jerneja (eds.).
Ljubljana, 2012, pp. 111-116 (lecture, international peer review, full paper (in extenso), scientific)
Karan, M., Šnajder, J. & Dalbelo Bašić, B. (2012) Distributional Semantics Approach to Detecting Synonyms in Croatian Language. In: Erjavec, T. & Žganec Gros, J. (eds.) Proceedings of the Eighth Language Technologies Conference.
@inproceedings{article,
  author    = {Karan, Mladen and \v{S}najder, Jan and Dalbelo Ba\v{s}i\'{c}, Bojana},
  title     = {Distributional Semantics Approach to Detecting Synonyms in Croatian Language},
  booktitle = {Proceedings of the Eighth Language Technologies Conference},
  editor    = {Erjavec, Toma\v{z} and \v{Z}ganec Gros, Jerneja},
  year      = {2012},
  pages     = {111--116},
  address   = {Ljubljana, Slovenija},
  keywords  = {Named Entities, Extraction, Classification}
}



