Bibliographic record no.: 1155753

Characterisation of COVID-19-Related Tweets in the Croatian Language: Framework Based on the Cro-CoV-cseBERT Model


Babić, Karlo; Petrović, Milan; Beliga, Slobodan; Martinčić-Ipšić, Sanda; Matešić, Mihaela; Meštrović, Ana
Characterisation of COVID-19-Related Tweets in the Croatian Language: Framework Based on the Cro-CoV-cseBERT Model // Applied Sciences-Basel, 11 (2021), 21; 10442, 22 doi:10.3390/app112110442 (international peer review, article, scientific)


CROSBI ID: 1155753

Title
Characterisation of COVID-19-Related Tweets in the Croatian Language: Framework Based on the Cro-CoV-cseBERT Model

Authors
Babić, Karlo ; Petrović, Milan ; Beliga, Slobodan ; Martinčić-Ipšić, Sanda ; Matešić, Mihaela ; Meštrović, Ana

Source
Applied Sciences-Basel (2076-3417) 11 (2021), 21; 10442, 22

Type, subtype and category of work
Journal papers, article, scientific

Keywords
sentiment analysis ; clustering ; BERT model ; natural language processing ; COVID-19 ; Twitter data ; social media

Abstract
This study aims to provide insights into the COVID-19-related communication on Twitter in the Republic of Croatia. For that purpose, we developed an NLP-based framework that enables automatic analysis of a large dataset of tweets in the Croatian language. We collected and analysed 206,196 tweets related to COVID-19 and constructed a dataset of 10,000 tweets which we manually annotated with a sentiment label. We trained the Cro-CoV-cseBERT language model for the representation and clustering of tweets. Additionally, we compared the performance of four machine learning algorithms on the task of sentiment classification. After identifying the best performing setup of NLP methods, we applied the proposed framework in the task of characterisation of COVID-19 tweets in Croatia. More precisely, we performed sentiment analysis and tracked the sentiment over time. Furthermore, we detected how tweets are grouped into clusters with similar themes across three pandemic waves. Additionally, we characterised the tweets by analysing the distribution of sentiment polarity (in each thematic cluster and over time) and the number of retweets (in each thematic cluster and sentiment class). These results could be useful for additional research and interpretation in the domains of sociology, psychology or other sciences, as well as for the authorities, who could use them to address crisis communication problems.

Original language
English

Scientific fields
Computing, Information and Communication Sciences, Interdisciplinary Social Sciences



RELATED PROJECTS AND INSTITUTIONS


Projects:
HRZZ-IP-CORONA-2020-04-2061 - Multilayer framework for characterising the spread of information via social media during the COVID-19 crisis (InfoCoV) (Meštrović, Ana, HRZZ - 2020-04) (CroRIS)
NadSve-Sveučilište u Rijeci-uniri-drustv-18-38 - Methods for measuring the semantic similarity of texts (SemText) (Meštrović, Ana, NadSve - Call for research support funding at the University of Rijeka for 2018 - projects of experienced scientists and artists) (CroRIS)

Institutions:
Filozofski fakultet, Rijeka
Fakultet informatike i digitalnih tehnologija, Rijeka

Links to the full text:

Full-text access: doi:10.3390/app112110442 ; www.mdpi.com

Cite this publication:

Babić, Karlo; Petrović, Milan; Beliga, Slobodan; Martinčić-Ipšić, Sanda; Matešić, Mihaela; Meštrović, Ana
Characterisation of COVID-19-Related Tweets in the Croatian Language: Framework Based on the Cro-CoV-cseBERT Model // Applied Sciences-Basel, 11 (2021), 21; 10442, 22 doi:10.3390/app112110442 (international peer review, article, scientific)
Babić, K., Petrović, M., Beliga, S., Martinčić-Ipšić, S., Matešić, M. & Meštrović, A. (2021) Characterisation of COVID-19-Related Tweets in the Croatian Language: Framework Based on the Cro-CoV-cseBERT Model. Applied Sciences-Basel, 11 (21), 10442, 22 doi:10.3390/app112110442.
@article{article,
  author = {Babi\'{c}, Karlo and Petrovi\'{c}, Milan and Beliga, Slobodan and Martin\v{c}i\'{c}-Ip\v{s}i\'{c}, Sanda and Mate\v{s}i\'{c}, Mihaela and Me\v{s}trovi\'{c}, Ana},
  title = {Characterisation of COVID-19-Related Tweets in the Croatian Language: Framework Based on the Cro-CoV-cseBERT Model},
  journal = {Applied Sciences-Basel},
  year = {2021},
  volume = {11},
  number = {21},
  pages = {22},
  chapter = {10442},
  chapternumber = {10442},
  issn = {2076-3417},
  doi = {10.3390/app112110442},
  keywords = {sentiment analysis, clustering, BERT model, natural language processing, COVID-19, Twitter data, social media},
  keyword = {sentiment analysis, clustering, BERT model, natural language processing, COVID-19, Twitter data, social media}
}

Journal indexed in:


  • Current Contents Connect (CCC)
  • Web of Science Core Collection (WoSCC)
    • Science Citation Index Expanded (SCI-EXP)
    • SCI-EXP, SSCI and/or A&HCI
  • Scopus

