Textual Analysis of Good Practice Requirements of EuroRec Repository Statements

Hercigonja-Szekeres, Mira; Ilakovac, Vesna

Data source: CROSBI

Textual Analysis of Good Practice Requirements of EuroRec Repository Statements (CROSBI ID 567539)

Conference contribution in proceedings | abstract of conference presentation | international peer review

Hercigonja-Szekeres, Mira ; Ilakovac, Vesna. Textual Analysis of Good Practice Requirements of EuroRec Repository Statements // EMMIT 2010 EURO-MEDITERRANEAN MEDICAL INFORMATICS and TELEMEDICINE 6th International Conference - Book of Abstracts / Sicurello, Francesco ; Ilakovac, Vesna ; Đogaš, Zoran (eds.). Split, 2010. p. 50

Responsibility details

Authors

Hercigonja-Szekeres, Mira ; Ilakovac, Vesna

Basic data in the original language
Basic data in other languages

Language

English

Title

Textual Analysis of Good Practice Requirements of EuroRec Repository Statements

Abstract

Objective and design. The European Institute for Health Records (EuroRec) has compiled a generic and comprehensive repository of statements that describe the use of functions, structuring, and data elements of Electronic Health Record (EHR) systems. The EHR-Q-TN project makes the repository and its tools accessible to partners from European countries in order to establish seamless cross-border multilingual description and validation or certification of EHR systems and use functions. The repository statements are grouped in two interlinked sets: Fine Grained Statements (FGS) and Good Practice Requirements (GPR). During the translation of repository statements into national languages, two threats to the consistency and coherence of translations were identified. First, some English words or phrases do not exist in other languages, and the substitutes chosen in different national languages might alter the meaning of the original statement. Second, translations within the same language may be inconsistent. The aim of this study was to support the creation of a coherent multilingual dictionary by identifying the most frequent words and segments within GPR repository statements using statistical textual analysis.
Methods. The text corpus comprised 178 GPR statements. We performed lexicometric analysis, analysis of repeated segments, and text concordance analysis. For the purpose of the analysis, we excluded articles (a, an, the) from the text corpus and joined the auxiliary verb and “not” in negative verb forms. No other changes were made to the text. The French software Dtm-Vic (Data and Text Mining – Visualization, Inference, Classification) was used for the analyses.
Results. The analyzed text corpus of 178 GPR statements contained 4990 words in total, of which 1053 (21.1%) were distinct. Among the 20 most frequent words (frequency over 40), 13 (65%) were meaningful words: “system”, “enables”, “user”, “medicinal”, “medication”, “product”, “data”, “health”, “prescription”, “well”, “be”, “item”, and “patient”. The word “system” was the most frequent, with 209 occurrences in the text corpus. In the analysis of repeated segments, we limited segments to a length of 3 words because we expected that segments of 2 and 3 words would give meaningful units suitable for translation. “System enables” and “medicinal product” were the two most frequent meaningful two-word segments, followed by “health item” and “enables user”. The ten most frequent two-word segments (5.4% of the total number of extracted segments) comprised 550 (25.4%) of the total 2166 words extracted in 186 segments. Text concordance analysis extracted multipart segments grouped around certain words, forming long segments suitable for direct translation, such as “system enables user to”.
Conclusions. Statistical textual analysis might be a useful tool for bridging the gap in the multilingual environment of unified EHR system quality assessment. Combining lexicometric analysis, analysis of repeated segments, and text concordance analysis makes it possible to easily identify the words or segments that have the greatest weight in the text corpus of repository statements. Using these words and segments as the basis of translation and including them in a multilingual dictionary would enable consistent and coherent translation of the majority of repository statements.
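
The three analyses named in the Methods can be illustrated with a minimal Python sketch. It is not Dtm-Vic and does not reproduce the authors' procedure; it only shows the general idea of word frequency counting, repeated-segment extraction (segments of 2 and 3 words), and keyword-in-context concordance. The sample statements are invented for illustration and are not taken from the EuroRec repository. The list of auxiliary verbs and the use of an underscore to join an auxiliary with “not” are assumptions, since the abstract does not specify them.

```python
import re
from collections import Counter

ARTICLES = {"a", "an", "the"}
# Assumed list of auxiliaries; the abstract does not enumerate them.
AUXILIARIES = {"is", "are", "was", "were", "do", "does", "did", "can", "could",
               "may", "might", "must", "shall", "should", "will", "would",
               "has", "have", "had"}


def preprocess(statement):
    """Lowercase, tokenize, drop articles, and join auxiliary + 'not'."""
    tokens = [t for t in re.findall(r"[a-z]+", statement.lower())
              if t not in ARTICLES]
    joined = []
    for tok in tokens:
        if tok == "not" and joined and joined[-1] in AUXILIARIES:
            # Assumed joining convention: "does not" becomes "does_not".
            joined[-1] = joined[-1] + "_not"
        else:
            joined.append(tok)
    return joined


def word_frequencies(corpus):
    """Count every token across all tokenized statements."""
    return Counter(tok for tokens in corpus for tok in tokens)


def repeated_segments(corpus, min_len=2, max_len=3, min_count=2):
    """Count word n-grams (within one statement) that occur at least min_count times."""
    counts = Counter()
    for tokens in corpus:
        for n in range(min_len, max_len + 1):
            for i in range(len(tokens) - n + 1):
                counts[" ".join(tokens[i:i + n])] += 1
    return Counter({seg: c for seg, c in counts.items() if c >= min_count})


def concordance(corpus, keyword, window=2):
    """Return (left context, keyword, right context) for each occurrence."""
    lines = []
    for tokens in corpus:
        for i, tok in enumerate(tokens):
            if tok == keyword:
                left = " ".join(tokens[max(0, i - window):i])
                right = " ".join(tokens[i + 1:i + 1 + window])
                lines.append((left, tok, right))
    return lines


# Hypothetical statements, written only to exercise the functions.
statements = [
    "The system enables the user to record a medicinal product.",
    "The system enables the user to view the health item.",
    "The system does not delete a health item.",
]
corpus = [preprocess(s) for s in statements]
freq = word_frequencies(corpus)
segments = repeated_segments(corpus)

print("total words:", sum(freq.values()), "distinct words:", len(freq))
print("most frequent words:", freq.most_common(3))
print("repeated segments:", segments.most_common())
for left, kw, right in concordance(corpus, "enables"):
    print(f"{left:>20} [{kw}] {right}")
```

In this sketch, segments are counted within a single statement only, so a segment never spans two statements; whether the original analysis treated statement boundaries this way is not stated in the abstract.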

Keywords

textual analysis; consistency of translations; EHR system; unified quality assessment

Note

not recorded

Language

not recorded

Title

not recorded

Abstract

not recorded

Keywords

not recorded

Note

not recorded

Contribution details

Pages

50

Year of publication

2010

Publication status

published

Host publication details

Title

EMMIT 2010 EURO-MEDITERRANEAN MEDICAL INFORMATICS and TELEMEDICINE 6th International Conference - Book of Abstracts

Editors

Sicurello, Francesco ; Ilakovac, Vesna ; Đogaš, Zoran

Publisher

Split:

Conference details

Conference

EMMIT 2010 EURO-MEDITERRANEAN MEDICAL INFORMATICS and TELEMEDICINE 6th International Conference

Type of participation

lecture

Conference dates

26.09.2010-28.09.2010

Conference location

Split, Croatia

Related records

Related persons

Mira Hercigonja Szekeres (author)

Vesna Ilakovac (author)

Related institutions

Medicinski fakultet Osijek (219) (author's institution)

Related projects

Valjanost podataka objavljenih u znanstvenom časopisu (result of work on the project)

Field

Public health and health care