Frequently Asked Questions Retrieval for Croatian Based on Semantic Textual Similarity (CROSBI ID 597898)
Conference paper in proceedings | original scientific paper | international peer review
Responsibility details
Karan, Mladen ; Žmak, Lovro ; Šnajder, Jan
English
Frequently Asked Questions Retrieval for Croatian Based on Semantic Textual Similarity
Frequently asked questions (FAQ) are an efficient way of communicating domain-specific information to users. Unlike general-purpose retrieval engines, FAQ retrieval engines have to address the lexical gap between the query and the usually short answer. In this paper we describe the design and evaluation of a FAQ retrieval engine for Croatian. We frame the task as a binary classification problem and train a model to classify each FAQ as either relevant or not relevant for a given query. We use a variety of semantic textual similarity features, including term overlap and vector space features. We train and evaluate on a FAQ test collection built specifically for this purpose. Our best-performing model reaches a mean reciprocal rank of 0.47, i.e., on average it ranks the relevant answer among the top two returned answers.
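To make the reported figure concrete, the following is a minimal sketch of how mean reciprocal rank (MRR) is commonly computed; it is not the authors' evaluation code. The function name `mean_reciprocal_rank` and the example ranks are hypothetical and chosen only for illustration. When every query has a relevant answer in the returned list, the reciprocal of the MRR is the harmonic mean of the ranks, so an MRR of 0.47 corresponds to a harmonic-mean rank of roughly 2.1.

```python
from typing import Optional, Sequence


def mean_reciprocal_rank(first_relevant_ranks: Sequence[Optional[int]]) -> float:
    """Average of 1/rank over queries.

    Each entry is the 1-based rank of the first relevant FAQ returned for
    one query, or None if no relevant FAQ was returned (contributes 0).
    """
    if not first_relevant_ranks:
        raise ValueError("at least one query is required")
    total = 0.0
    for rank in first_relevant_ranks:
        if rank is not None:
            total += 1.0 / rank
    return total / len(first_relevant_ranks)


# Hypothetical ranks for four queries (illustrative, not data from the paper).
ranks = [1, 2, 4, None]
mrr = mean_reciprocal_rank(ranks)
print(f"MRR = {mrr:.4f}")
```

Counting a query with no relevant answer as zero is one common convention; the abstract does not say which convention the evaluation used.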
FAQ retrieval ; information retrieval ; semantic textual similarity ; Croatian language
Contribution details
24-33.
2013.
published
Host publication details
Proceedings of the 4th Biennial International Workshop on Balto-Slavic Natural Language Processing
Sofia: Association for Computational Linguistics (ACL)
Conference details
4th Biennial International Workshop on Balto-Slavic Natural Language Processing
oral presentation
08.08.2013-09.08.2013
Sofia, Bulgaria