Bibliographic record no.: 1089158
Building Croatian medical dictionary from medical corpus
Building Croatian medical dictionary from medical corpus // Rasprave Instituta za hrvatski jezik i jezikoslovlje, 46 (2020), 2; 765-782 doi:10.31724/rihjj.46.2.17 (international peer review, preliminary communication, scientific)
CROSBI ID: 1089158
Title
Building Croatian medical dictionary from medical corpus
Authors
Kocijan, Kristina ; Kurolt, Silvia ; Mijić, Linda
Source
Rasprave Instituta za hrvatski jezik i jezikoslovlje (1331-6745) 46 (2020), 2; 765-782
Type, subtype, and category of work
Journal articles, preliminary communication, scientific
Keywords
language processing ; semantic annotations ; medical domain ; NooJ ; Croatian
Abstract
The overall objective of this project is to define the linguistic models, at the lexical and syntactic levels, that appear in the health domain, depending on the type of corpus. In the first phase of the project, the texts forming the medical corpus A – MedCorA (2,232 pharmaceutical instructions for medicines available in Croatia) were prepared. The terminology found in this corpus was analyzed, and the semantic subdomains within the medical domain (anatomy, condition, microorganism, chemistry, etc.) were defined and added to the dictionary entries. These dictionary resources served as the foundation for the second phase, in which NooJ morphological grammars were built to annotate medical terminology in the corpus. The grammars recognize not only Croatian words but also Latinisms and Latin expressions written with Croatian case endings. The prepared resources are made available to the broader scientific community via Sketch Engine for further research in the field of medicine. They enable additional research and the development of algorithms for, among other tasks, medical document classification, information retrieval from medical texts, and machine translation of medical documentation, taking into account quality and reliability as well as terminological variability.
Original language
English
Scientific fields
Information and communication sciences, Philology
AFFILIATIONS
Institutions:
Filozofski fakultet, Zagreb,
Sveučilište u Zadru
The journal is indexed in:
- Web of Science Core Collection (WoSCC)
- Emerging Sources Citation Index (ESCI)
- Scopus