Bibliographic record no. 326450

Automatic Acquisition of Inflectional Lexica for Morphological Normalisation


Šnajder, Jan; Dalbelo Bašić, Bojana; Tadić, Marko
Automatic Acquisition of Inflectional Lexica for Morphological Normalisation // Information Processing & Management, 44 (2008), 5; 1720-1731 doi:10.1016/j.ipm.2008.03.006 (international peer review, article, scientific)


CROSBI ID: 326450

Title
Automatic Acquisition of Inflectional Lexica for Morphological Normalisation

Authors
Šnajder, Jan ; Dalbelo Bašić, Bojana ; Tadić, Marko

Source
Information Processing & Management (0306-4573) 44 (2008), 5; 1720-1731

Type, subtype and category of work
Journal papers, article, scientific

Keywords
Morphological normalisation; morphological lexicon; lexicon acquisition; inflection; Croatian language; text mining; information retrieval

Abstract
Due to natural language morphology, words can take on various morphological forms. Morphological normalisation – often used in information retrieval and text mining systems – conflates morphological variants of a word to a single representative form. In this paper, we describe an approach to lexicon-based inflectional normalisation. This approach is in between stemming and lemmatisation, and is suitable for morphological normalisation of inflectionally complex languages. To eliminate the immense effort required to compile the lexicon by hand, we focus on the problem of acquiring automatically an inflectional morphological lexicon from raw corpora. We propose a convenient and highly expressive morphology representation formalism on which the acquisition procedure is based. Our approach is applied to the morphologically complex Croatian language, but it should be equally applicable to other languages of similar morphological complexity. Experimental results show that our approach can be used to acquire a lexicon whose linguistic quality allows for rather good normalisation performance.
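The idea of lexicon-based normalisation described in the abstract can be illustrated with a minimal sketch. It assumes a tiny hand-written lexicon of Croatian noun forms; the names `TOY_LEXICON`, `build_form_index` and `normalise` are hypothetical, and the code does not reproduce the paper's representation formalism or its acquisition procedure.

```python
# Toy inflectional lexicon: representative form -> known inflected forms.
# Hypothetical sample data for illustration, not taken from the paper.
TOY_LEXICON = {
    "kuća": ["kuća", "kuće", "kući", "kuću", "kućom", "kućama"],
    "knjiga": ["knjiga", "knjige", "knjizi", "knjigu", "knjigom", "knjigama"],
}


def build_form_index(lexicon):
    """Invert the lexicon so each word form points to its representative form."""
    index = {}
    for lemma, forms in lexicon.items():
        for form in forms:
            index[form.lower()] = lemma
    return index


def normalise(tokens, form_index):
    """Conflate known variants; unknown tokens are only lowercased (an assumption)."""
    return [form_index.get(token.lower(), token.lower()) for token in tokens]


FORM_INDEX = build_form_index(TOY_LEXICON)
print(normalise(["Knjigu", "kućama", "grad"], FORM_INDEX))
```

Tokens missing from the lexicon pass through unchanged apart from lowercasing, which is one simple fallback among several possible ones.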

Original language
English

Scientific fields
Computer science, Information and communication sciences, Philology



RELATED PROJECTS, INSTITUTIONS AND PROFILES


Projects:
036-1300646-1986 - Otkrivanje znanja u tekstnim podacima [Knowledge Discovery in Textual Data] (Dalbelo-Bašić, Bojana, MZO)
130-1300646-0645 - Hrvatski jezični resursi i njihovo obilježavanje [Croatian Language Resources and Their Annotation] (Tadić, Marko, MZOS)

Institutions:
Faculty of Electrical Engineering and Computing, Zagreb
Faculty of Humanities and Social Sciences, Zagreb

Profiles:

Jan Šnajder (author)

Bojana Dalbelo Bašić (author)

Marko Tadić (author)

Links to the full text of the work:

doi dx.doi.org

Cite this publication:

Šnajder, J., Dalbelo Bašić, B. & Tadić, M. (2008) Automatic Acquisition of Inflectional Lexica for Morphological Normalisation. Information Processing & Management, 44 (5), 1720-1731. doi:10.1016/j.ipm.2008.03.006
@article{article,
  author   = {\v{S}najder, Jan and Dalbelo Ba\v{s}i\'{c}, Bojana and Tadi\'{c}, Marko},
  title    = {Automatic Acquisition of Inflectional Lexica for Morphological Normalisation},
  journal  = {Information Processing \& Management},
  year     = {2008},
  volume   = {44},
  number   = {5},
  pages    = {1720--1731},
  issn     = {0306-4573},
  doi      = {10.1016/j.ipm.2008.03.006},
  keywords = {Morphological normalisation, morphological lexicon, lexicon acquisition, inflection, Croatian language, text mining, information retrieval}
}

The journal is indexed in:


  • Current Contents Connect (CCC)
  • Web of Science Core Collection (WoSCC)
    • Science Citation Index Expanded (SCI-EXP)
    • Social Sciences Citation Index (SSCI)
    • SCI-EXP, SSCI and/or A&HCI
  • Scopus


Inclusion in other bibliographic databases:


  • Compu-Math Citation Index
  • Information Science Abstracts
  • LISA: Library and Information Science Abstracts
  • PIRA (Packaging, Paper, Printing and Publishing, Imaging and Nonwovens Abstracts)
  • PsycINFO
  • Zentralblatt für Mathematik/Mathematical Abstracts

