Bibliographic record no. 656949

A Comparative Evaluation of Cross-Lingual Text Annotation Techniques


Zhang, Lei; Rettinger, Achim; Färber, Michael; Tadić, Marko
A Comparative Evaluation of Cross-Lingual Text Annotation Techniques // Information Access Evaluation. Multilinguality, Multimodality, and Visualization / Forner, Pamela ; Müller, Henning ; Paredes, Roberto ; Rosso, Paolo ; Stein, Benno (eds.).
Berlin-Heidelberg: Springer, 2013. pp. 124-135


Title
A Comparative Evaluation of Cross-Lingual Text Annotation Techniques

Authors
Zhang, Lei ; Rettinger, Achim ; Färber, Michael ; Tadić, Marko

Type, subtype and category of work
Book chapters, scientific

Book
Information Access Evaluation. Multilinguality, Multimodality, and Visualization

Editor(s)
Forner, Pamela ; Müller, Henning ; Paredes, Roberto ; Rosso, Paolo ; Stein, Benno

Publisher
Springer

City
Berlin-Heidelberg

Year
2013

Page range
124-135

ISBN
978-3-642-40801-4

Keywords
Cross-lingual annotation, knowledge extraction, parallel corpus, language technologies, knowledge technologies

Abstract
In this paper, we study the problem of extracting knowledge from textual documents written in different languages by annotating the text on the basis of a cross-lingual knowledge base, namely Wikipedia. Our contribution is twofold. First, we propose a novel framework for evaluating cross-lingual text annotation techniques, based on annotation of a parallel corpus to a hub-language in a cross-lingual knowledge base. Second, we investigate the performance of different cross-lingual text annotation techniques according to our proposed evaluation framework. We perform experiments for an empirical comparison of three approaches: (i) Cross-lingual Named Entity Annotation (CL-NEA), (ii) Cross-lingual Wikifier Annotation (CL-WIFI), and (iii) Cross-lingual Explicit Semantic Analysis (CL-ESA). Besides establishing an evaluation framework, our results show the advantages and disadvantages of the three investigated approaches and clarify the roles of them for different purposes.

Original language
English

Scientific fields
Information and communication sciences, Philology



AFFILIATIONS OF THE WORK


Project / topic
130-1300646-0645 - Hrvatski jezični resursi i njihovo obilježavanje [Croatian Language Resources and Their Annotation] (Marko Tadić)

Institutions
Faculty of Humanities and Social Sciences, Zagreb

Author with researcher registry number:
Marko Tadić (157043)