Bibliographic record no. 584114
TakeLab: Systems for Measuring Semantic Text Similarity
TakeLab: Systems for Measuring Semantic Text Similarity // *SEM 2012: The First Joint Conference on Lexical and Computational Semantics / Agirre, Eneko ; Bos, Johan ; Diab, Mona ; Manandhar, Suresh ; Marton, Yuval ; Yuret, Deniz (eds.).
Montréal: Association for Computational Linguistics (ACL), 2012. pp. 441-448 (lecture, international peer review, full paper (in extenso), scientific)
CROSBI ID: 584114
Title
TakeLab: Systems for Measuring Semantic Text Similarity
Authors
Šarić, Frane ; Glavaš, Goran ; Karan, Mladen ; Šnajder, Jan ; Dalbelo Bašić, Bojana
Type, subtype, and category of work
Papers in conference proceedings, full paper (in extenso), scientific
Source
*SEM 2012: The First Joint Conference on Lexical and Computational Semantics
/ Agirre, Eneko ; Bos, Johan ; Diab, Mona ; Manandhar, Suresh ; Marton, Yuval ; Yuret, Deniz - Montréal : Association for Computational Linguistics (ACL), 2012, 441-448
Conference
*SEM 2012: Joint Conference on Lexical and Computational Semantics
Place and date
Montréal, Canada, 7-8 June 2012
Type of participation
Lecture
Type of review
International peer review
Keywords
textual similarity ; semantic similarity ; machine learning
Abstract
This paper describes the two systems for determining the semantic similarity of short texts submitted to the SemEval 2012 Task 6. Most of the research on semantic similarity of textual content focuses on large documents. However, a fair amount of information is condensed into short text snippets such as social media posts, image captions, and scientific abstracts. We predict the human ratings of sentence similarity using a support vector regression model with multiple features measuring word-overlap similarity and syntax similarity. Out of 89 systems submitted, our two systems ranked in the top 5 for the three overall evaluation metrics used (overall Pearson – 2nd and 3rd, normalized Pearson – 1st and 3rd, weighted mean – 2nd and 5th).
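As a rough illustration of what a word-overlap feature might look like, the sketch below computes a Jaccard overlap between the token sets of two sentences. The abstract does not specify the exact features used, so the function names (`tokens`, `word_overlap`), the tokenization rule, and the choice of Jaccard overlap are all assumptions made for this example, not the authors' method. In a setup like the one described, several such scores per sentence pair would presumably serve as inputs to a regression model trained on human similarity ratings.

```python
import re


def tokens(text):
    """Lowercase a sentence and return its set of alphanumeric tokens.

    Hypothetical tokenizer for illustration; the paper's preprocessing
    is not described in the abstract.
    """
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def word_overlap(s1, s2):
    """Jaccard overlap of two sentences' token sets, in [0.0, 1.0].

    An assumed stand-in for a word-overlap similarity feature.
    """
    a, b = tokens(s1), tokens(s2)
    if not a and not b:
        # Two empty sentences are treated as identical.
        return 1.0
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    score = word_overlap("A man is playing a guitar", "A man plays a guitar")
    print(f"overlap = {score:.2f}")
```

A score of this kind only captures surface overlap; it treats "playing" and "plays" as different words, which is one reason a system would combine it with other features rather than use it alone.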
Original language
English
Scientific fields
Computing
LINKS TO THIS WORK
Projects:
036-1300646-1986 - Otkrivanje znanja u tekstnim podacima (Knowledge Discovery in Textual Data) (Dalbelo-Bašić, Bojana, MZO) (CroRIS)
Institutions:
Faculty of Electrical Engineering and Computing, Zagreb
Profiles:
Bojana Dalbelo Bašić
(author)
Goran Glavaš
(author)
Frane Šarić
(author)
Jan Šnajder
(author)
Mladen Karan
(author)