Bibliographic record no.: 1209197
Formalizing the Recognition of Medical Domain Multiword Units
Formalizing the Recognition of Medical Domain Multiword Units // Natural Language Processing in Healthcare: A Special Focus on Low Resource Languages / Dash, Satya Ranjan ; Parida, Shantipriya ; Tello, Esaú Villatoro ; Acharya, Biswaranjan ; Bojar, Ondřej (eds.).
Boca Raton (FL): CRC Press, 2022. pp. 89-120. doi:10.1201/9781003138013-5
CROSBI ID: 1209197
Title
Formalizing the Recognition of Medical Domain Multiword Units
Authors
Kocijan, Kristina ; Šojat, Krešimir
Type, subtype and category of work
Book chapters, scientific
Book
Natural Language Processing in Healthcare: A Special Focus on Low Resource Languages
Editor(s)
Dash, Satya Ranjan ; Parida, Shantipriya ; Tello, Esaú Villatoro ; Acharya, Biswaranjan ; Bojar, Ondřej
Publisher
CRC Press
City
Boca Raton (FL)
Year
2022
Page range
89-120
ISBN
9781003138013
Keywords
multiword units, medical domain, low resource settings, digital medical lexicon, Croatian language, finite-state grammars, NooJ
Abstract
The chapter deals with the recognition of medical domain multiword units (MWUs) in texts written in Croatian. The focus is on the automatic recognition of complex MWUs in low-resource settings. These units are complex in that they consist of two or more noun or prepositional phrases. They include three different models, such as ‘symptomatic treatment of patients’ [simptomatsko(A) liječenje(N NOM) bolesnika(N GEN)] or ‘herbal anti-asthmatic syrup’ [biljni(A) sirup(N NOM) protiv(PREP) kašlja(N GEN)], as well as more complex ones, such as ‘continuous evaluation of the risk-benefit balance of the drug’ [kontinuirano(A) praćenje(NOUN) omjera(N GEN) koristi(N GEN) i(C) rizika(N GEN) lijeka(N GEN)]. Our method for detecting MWUs is based on morpho-syntactic rules in the form of finite-state transducers that are applied at the syntactic level of analysis. The algorithms we propose are designed within the NooJ platform, with the main objective of automatically building a medical lexicon extracted directly from a medical domain corpus. Such a digital lexicon will be valuable for further processing of medical texts and for various NLP tasks in this domain, including enhanced clinical analytics, text mining, and machine translation, at later stages of the project.
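The tag patterns quoted in the abstract can be illustrated with a minimal Python sketch. It is not the authors' NooJ grammar: the tokens are tagged by hand, the single-letter tag codes and the name `find_mwus` are hypothetical, the tag NOUN from the third example is treated as N NOM, and one regular expression over the tag sequence stands in for the finite-state rules as a plain recognizer.

```python
import re

# Hypothetical one-letter codes for the tags used in the abstract's examples.
TAG_CODES = {"A": "A", "N_NOM": "N", "N_GEN": "G", "PREP": "P", "C": "C"}

# Assumed pattern covering the three quoted examples:
# adjective, nominative head noun, optional preposition, a genitive noun,
# then any number of further genitive nouns, each optionally preceded by a conjunction.
MWU_PATTERN = re.compile(r"ANP?G(?:C?G)*")


def find_mwus(tagged):
    """Return candidate multiword units from a list of (word, tag) pairs."""
    # One character per token, so regex offsets equal token indices.
    codes = "".join(TAG_CODES.get(tag, "x") for _, tag in tagged)
    return [
        " ".join(word for word, _ in tagged[m.start():m.end()])
        for m in MWU_PATTERN.finditer(codes)
    ]


# Hand-tagged input; a real system would take tags from a morphological lexicon.
sample = [
    ("simptomatsko", "A"), ("liječenje", "N_NOM"), ("bolesnika", "N_GEN"),
    ("i", "C"),
    ("biljni", "A"), ("sirup", "N_NOM"), ("protiv", "PREP"), ("kašlja", "N_GEN"),
]
complex_sample = [
    ("kontinuirano", "A"), ("praćenje", "N_NOM"), ("omjera", "N_GEN"),
    ("koristi", "N_GEN"), ("i", "C"), ("rizika", "N_GEN"), ("lijeka", "N_GEN"),
]

print(find_mwus(sample))
print(find_mwus(complex_sample))
```

Because the genitive chain is matched greedily, a pattern this loose would over-generate on running text; agreement checks and lexical constraints of the kind a full grammar provides are left out of the sketch.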
Original language
English
Scientific fields
Information and Communication Sciences, Interdisciplinary Social Sciences, Philology, Interdisciplinary Humanities
RELATED ENTITIES
Projects:
FFZG--11-931-1047 - Obrada prirodnog jezika u domeni zdravstva [Natural Language Processing in the Healthcare Domain] (Kocijan, Kristina, FFZG) (CroRIS)
Institutions:
Filozofski fakultet, Zagreb