Bibliographic record no. 950542

Document-based Topic Coherence Measures for News Media Text


Korenčić, Damir; Ristov, Strahil; Šnajder, Jan
Document-based Topic Coherence Measures for News Media Text // Expert systems with applications, 114 (2018), 357-373 doi:10.1016/j.eswa.2018.07.063 (international peer review, article, scientific)


CROSBI ID: 950542

Title
Document-based Topic Coherence Measures for News Media Text

Authors
Korenčić, Damir ; Ristov, Strahil ; Šnajder, Jan

Source
Expert systems with applications (0957-4174) 114 (2018); 357-373

Type, subtype, and category of work
Journal papers, article, scientific

Keywords
topic models ; topic coherence ; topic model evaluation ; text analysis ; news text ; exploratory analysis

Abstract
There is a rising need for automated analysis of news text, and topic models have proven to be useful tools for this task. However, as the quality of the topics induced by topic models greatly varies, much research effort has been devoted to their automated evaluation. Recent research has focused on topic coherence as a measure of a topic’s quality. Existing topic coherence measures work by considering the semantic similarity of topic words. This makes them unfit to detect the coherence of transient topics with semantically unrelated topic words, which abound in news media texts. In this paper, we introduce the notion of document-based topic coherence and propose novel topic coherence measures that estimate topic coherence based on topic documents rather than topic words. We evaluate the proposed measures on two datasets containing topics manually labeled for document-based coherence, on which the proposed measures outperform a strong baseline as well as word-based coherence measures. We also demonstrate the usefulness of document-based coherence measures for automated topic discovery from news media texts.

Original language
English

Scientific fields
Computer science



WORK AFFILIATIONS


Institutions:
Fakultet elektrotehnike i računarstva, Zagreb,
Institut "Ruđer Bošković", Zagreb

Profiles:

Damir Korenčić (author)

Jan Šnajder (author)

Strahil Ristov (author)

Cite this publication:

Korenčić, Damir; Ristov, Strahil; Šnajder, Jan
Document-based Topic Coherence Measures for News Media Text // Expert systems with applications, 114 (2018), 357-373 doi:10.1016/j.eswa.2018.07.063 (international peer review, article, scientific)
Korenčić, D., Ristov, S. & Šnajder, J. (2018) Document-based Topic Coherence Measures for News Media Text. Expert systems with applications, 114, 357-373 doi:10.1016/j.eswa.2018.07.063.
@article{article,
  author = {Koren\v{c}i\'{c}, Damir and Ristov, Strahil and \v{S}najder, Jan},
  title = {Document-based Topic Coherence Measures for News Media Text},
  journal = {Expert systems with applications},
  year = {2018},
  volume = {114},
  pages = {357-373},
  issn = {0957-4174},
  doi = {10.1016/j.eswa.2018.07.063},
  keywords = {topic models, topic coherence, topic model evaluation, text analysis, news text, exploratory analysis}
}

Journal indexed in:


  • Current Contents Connect (CCC)
  • Web of Science Core Collection (WoSCC)
    • Science Citation Index Expanded (SCI-EXP)
    • SCI-EXP, SSCI and/or A&HCI
  • Scopus

