Bibliographic record no.: 1061402

Corpus-Based Paraphrase Detection Experiments and Review


Vrbanec, Tedo; Meštrović, Ana
Corpus-Based Paraphrase Detection Experiments and Review // Information, 11 (2020), 5; 241, 24 doi:10.3390/info11050241 (international peer review, review article, scientific)


CROSBI ID: 1061402

Title
Corpus-Based Paraphrase Detection Experiments and Review

Authors
Vrbanec, Tedo ; Meštrović, Ana

Source
Information (2078-2489) 11 (2020), 5; 241, 24

Type, subtype, and category of work
Journal articles, review article, scientific

Keywords
semantic similarity ; deep learning ; paraphrasing corpora ; experiments ; natural language processing

Abstract
Paraphrase detection is important for a number of applications, including plagiarism detection, authorship attribution, question answering, text summarization, and text mining in general. In this paper, we give a performance overview of various types of corpus-based models, especially deep learning (DL) models, on the task of paraphrase detection. We report the results of eight models (LSI, TF-IDF, Word2Vec, Doc2Vec, GloVe, FastText, ELMO, and USE) evaluated on three publicly available corpora: Microsoft Research Paraphrase Corpus, Clough and Stevenson, and Webis Crowd Paraphrase Corpus 2011. Through a great number of experiments, we decided on the most appropriate approaches for text pre-processing: hyper-parameters, sub-model selection where one exists (e.g., Skipgram vs. CBOW), distance measures, and the semantic similarity/paraphrase detection threshold. Our findings and those of other researchers who have used deep learning models show that DL models are very competitive with traditional state-of-the-art approaches and have potential that should be further developed.
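
The abstract mentions three tunable ingredients: a text representation, a distance measure, and a detection threshold. As a rough illustration of how they fit together, here is a minimal sketch using TF-IDF (one of the eight listed models) with cosine similarity. It is not the authors' implementation: the tokenizer, the smoothed IDF formula, the function names, and the threshold of 0.5 are all assumptions chosen for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens (assumed pre-processing)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(documents):
    """Return one sparse TF-IDF vector (a dict of term -> weight) per document."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            # Smoothed IDF (an assumption) keeps every weight strictly positive.
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine_similarity(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def is_paraphrase(text_a, text_b, background=(), threshold=0.5):
    """Flag a pair as a paraphrase when similarity reaches the threshold.

    `background` is an optional list of extra documents used only to
    estimate IDF; `threshold` is an illustrative value, not one from the paper.
    """
    vectors = tfidf_vectors([text_a, text_b, *background])
    return cosine_similarity(vectors[0], vectors[1]) >= threshold
```

In practice the threshold would be tuned per corpus and per model, which is the kind of selection the abstract describes.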

Original language
English

Scientific fields
Computing, Information and Communication Sciences



RELATED INFORMATION


Projects:
NadSve-Sveučilište u Rijeci-uniri-drustv-18-38 - Methods for Measuring the Semantic Similarity of Texts (SemText) (Meštrović, Ana, NadSve - Call for funding to support scientific research at the University of Rijeka for 2018 - projects of experienced scientists and artists) (CroRIS)

Institutions:
Faculty of Teacher Education, Zagreb
Faculty of Informatics and Digital Technologies, Rijeka

Profiles:

Ana Meštrović (author)

Tedo Vrbanec (author)

Links to the full text of the work:

doi www.mdpi.com

Cite this publication:

Vrbanec, T. & Meštrović, A. (2020) Corpus-Based Paraphrase Detection Experiments and Review. Information, 11 (5), 241, 24 doi:10.3390/info11050241.
@article{article,
  author = {Vrbanec, Tedo and Me\v{s}trovi\'{c}, Ana},
  title = {Corpus-Based Paraphrase Detection Experiments and Review},
  journal = {Information},
  year = {2020},
  volume = {11},
  number = {5},
  pages = {24},
  chapter = {241},
  issn = {2078-2489},
  doi = {10.3390/info11050241},
  keywords = {semantic similarity, deep learning, paraphrasing corpora, experiments, natural language processing}
}

The journal is indexed in:


  • Web of Science Core Collection (WoSCC)
    • Emerging Sources Citation Index (ESCI)
  • Scopus


Included in other bibliographic databases:


  • EI Compendex

