Data source: CROSBI

Formalizing the Recognition of Medical Domain Multiword Units (CROSBI ID 73731)

Book chapter | original scientific paper | international peer review

Kocijan, Kristina ; Šojat, Krešimir. Formalizing the Recognition of Medical Domain Multiword Units // Natural Language Processing in Healthcare: A Special Focus on Low Resource Languages / Dash, Satya Ranjan ; Parida, Shantipriya ; Tello, Esaú Villatoro et al. (eds.). Boca Raton (FL): CRC Press, 2022. pp. 89-120. doi: 10.1201/9781003138013-5

Statement of responsibility

Kocijan, Kristina ; Šojat, Krešimir

English

Formalizing the Recognition of Medical Domain Multiword Units

The chapter deals with the recognition of medical domain multiword units (MWUs) in Croatian-language texts. The focus is on the automatic recognition of complex MWUs in low-resource settings. These units are complex in that they consist of two or more noun or prepositional phrases, and include three different models, such as ‘symptomatic treatment of patients’ [simptomatsko(A) liječenje(N NOM) bolesnika(N GEN)] or ‘herbal anti-asthmatic syrup’ [biljni(A) sirup(N NOM) protiv(PREP) kašlja(N GEN)], as well as more complex ones, such as ‘continuous evaluation of the risk-benefit balance of the drug’ [kontinuirano(A) praćenje(NOUN) omjera(N GEN) koristi(N GEN) i(C) rizika(N GEN) lijeka(N GEN)]. Our method for detecting MWUs is based on morpho-syntactic rules in the form of finite-state transducers that are applied at the syntactic level of analysis. The algorithms we propose are designed within the NooJ platform, with the main objective of automatically building a medical lexicon extracted directly from a medical domain corpus. At later stages of the project, such a digital lexicon will be valuable for further processing of medical texts and for various NLP tasks in this domain, including enhanced clinical analytics, text mining, and machine translation.

multiword units, medical domain, low resource settings, digital medical lexicon, Croatian language, finite-state grammars, NooJ

Chapter details

89-120.

published

10.1201/9781003138013-5

Book details

Natural Language Processing in Healthcare: A Special Focus on Low Resource Languages

Dash, Satya Ranjan ; Parida, Shantipriya ; Tello, Esaú Villatoro ; Acharya, Biswaranjan ; Bojar, Ondřej

Boca Raton (FL): CRC Press

2022.

9781003138013

Related research fields

Philology, Information and Communication Sciences, Interdisciplinary Social Sciences, Interdisciplinary Humanities
