Bibliographic record number: 402071

Extending Lexical Association Measures for Collocation Extraction


Petrović, Saša; Šnajder, Jan; Dalbelo Bašić, Bojana
Extending Lexical Association Measures for Collocation Extraction // Computer Speech and Language, 24 (2010), 2; 383-394 doi:10.1016/j.csl.2009.06.001 (international peer review, article, scientific)


CROSBI ID: 402071

Title
Extending Lexical Association Measures for Collocation Extraction

Authors
Petrović, Saša ; Šnajder, Jan ; Dalbelo Bašić, Bojana

Source
Computer Speech and Language (0885-2308) 24 (2010), 2; 383-394

Type, subtype and category of work
Journal papers, article, scientific

Keywords
collocations; collocation extraction; lexical association measures

Abstract
Collocations are linguistic phenomena that occur when two or more words appear together more often than by chance and whose meaning often cannot be inferred from the meanings of their parts. As collocations have found many applications in the fields of natural language processing, information retrieval, and text mining, extracting them from large corpora has been the focus of many studies over the past few years. In this paper, we introduce the notion of an extension pattern, a formalization of the idea of extending lexical association measures (AMs) defined for bigrams. An extension pattern provides a measure-independent way of extending AMs for extracting collocations of arbitrary length. We define different extension patterns and compare them on a task of extracting collocations from a newspaper corpus. We show that the stopword-sensitive extension patterns we propose outperform other extensions, which indicates that AMs could benefit from taking into account linguistic information about an n-gram's part-of-speech pattern.
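
The abstract does not spell out the extension patterns themselves, so the following is only a rough sketch of the general idea, not the definitions used in the paper. It assumes pointwise mutual information (PMI) as the bigram AM and shows two hypothetical ways of reusing it on longer n-grams: `split_pattern` treats the n-gram as a pair (prefix, suffix), and `stopword_pattern` uses only the two content words as units when exactly two remain after stopwords are ignored. All function names, the toy token list, and the probability estimate (count divided by the number of tokens) are assumptions made for illustration.

```python
import math
from collections import Counter


def ngram_counts(tokens, max_n):
    """Count every contiguous n-gram of length 1..max_n (keys are tuples)."""
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


def make_prob(counts, total):
    """Crude probability estimate: count / number of tokens (an assumption)."""
    return lambda seq: counts[tuple(seq)] / total


def pmi(p_xy, p_x, p_y):
    """Bigram AM: pointwise mutual information of two units x and y.

    Only meaningful for sequences that actually occur (non-zero probability).
    """
    return math.log2(p_xy / (p_x * p_y))


def split_pattern(am, prob, split=1):
    """Hypothetical pattern: score an n-gram as the pair (prefix, suffix)."""
    def score(ngram):
        return am(prob(ngram), prob(ngram[:split]), prob(ngram[split:]))
    return score


def stopword_pattern(am, prob, stopwords, split=1):
    """Hypothetical stopword-sensitive pattern.

    If exactly two content words remain after ignoring stopwords, they are
    the two units; otherwise fall back to the prefix/suffix split.
    """
    fallback = split_pattern(am, prob, split)

    def score(ngram):
        content = [w for w in ngram if w not in stopwords]
        if len(content) == 2:
            return am(prob(ngram), prob(content[:1]), prob(content[1:]))
        return fallback(ngram)
    return score


# Toy usage on a made-up token list (not data from the paper).
tokens = "secretary of state met the secretary of state of the state".split()
prob = make_prob(ngram_counts(tokens, 3), len(tokens))
trigram = ("secretary", "of", "state")
print(split_pattern(pmi, prob)(trigram))
print(stopword_pattern(pmi, prob, {"of", "the"})(trigram))
```

Because both patterns take the AM as an argument, `pmi` could be swapped for any other function of the same three probabilities, which is one way to read the "measure-independent" property mentioned in the abstract.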

Original language
English

Scientific fields
Computer science



RELATED TO THIS WORK


Projects:
036-1300646-1986 - Otkrivanje znanja u tekstnim podacima (Knowledge Discovery in Textual Data) (Dalbelo-Bašić, Bojana, MZO) (CroRIS)

Institutions:
Faculty of Electrical Engineering and Computing, Zagreb

Profiles:

Jan Šnajder (author)

Bojana Dalbelo Bašić (author)

Links to the full text:

doi www.sciencedirect.com

Cite this publication:

Petrović, Saša; Šnajder, Jan; Dalbelo Bašić, Bojana
Extending Lexical Association Measures for Collocation Extraction // Computer Speech and Language, 24 (2010), 2; 383-394 doi:10.1016/j.csl.2009.06.001 (international peer review, article, scientific)
Petrović, S., Šnajder, J. & Dalbelo Bašić, B. (2010) Extending Lexical Association Measures for Collocation Extraction. Computer Speech and Language, 24 (2), 383-394 doi:10.1016/j.csl.2009.06.001.
@article{article,
  author   = {Petrovi\'{c}, Sa\v{s}a and \v{S}najder, Jan and Dalbelo Ba\v{s}i\'{c}, Bojana},
  title    = {Extending Lexical Association Measures for Collocation Extraction},
  journal  = {Computer Speech and Language},
  year     = {2010},
  volume   = {24},
  number   = {2},
  pages    = {383-394},
  issn     = {0885-2308},
  doi      = {10.1016/j.csl.2009.06.001},
  keywords = {collocations, collocation extraction, lexical association measures}
}

Journal indexed in:


  • Web of Science Core Collection (WoSCC)
    • Science Citation Index Expanded (SCI-EXP)
    • SCI-EXP, SSCI and/or A&HCI
  • Scopus

