Bibliographic record no. 930470

Merging Comparable Data Sources for the Discrimination of Similar Languages: The DSL Corpus Collection


Tan, Liling; Zampieri, Marcos; Ljubešić, Nikola; Tiedemann, Jörg
Merging Comparable Data Sources for the Discrimination of Similar Languages: The DSL Corpus Collection // Proceedings of the 7th Workshop on Building and Using Comparable Corpora (BUCC)
Reykjavík: European Language Resources Association (ELRA), 2014, pp. 20-24 (lecture, international peer review, full paper (in extenso), scientific)


CROSBI ID: 930470

Title
Merging Comparable Data Sources for the Discrimination of Similar Languages: The DSL Corpus Collection

Authors
Tan, Liling; Zampieri, Marcos; Ljubešić, Nikola; Tiedemann, Jörg

Type, subtype and category of work
Papers in conference proceedings, full paper (in extenso), scientific

Source
Proceedings of the 7th Workshop on Building and Using Comparable Corpora (BUCC). Reykjavík: European Language Resources Association (ELRA), 2014, pp. 20-24

Conference
7th Workshop on Building and Using Comparable Corpora (BUCC)

Place and date
Reykjavík, Iceland, 27 May 2014

Type of participation
Lecture

Type of peer review
International peer review

Keywords
comparable corpora, similar languages, language discrimination

Abstract
This paper presents the compilation of the DSL corpus collection created for the DSL (Discriminating Similar Languages) shared task to be held at the VarDial workshop at COLING 2014. The DSL corpus collection was merged from three comparable corpora to provide a suitable dataset for automatic classification to discriminate similar languages and language varieties. Along with the description of the DSL corpus collection, we also present results of baseline discrimination experiments, reporting accuracy of up to 87.4%.

Original language
English



RELATED ENTITIES


Institutions:
Filozofski fakultet, Zagreb

Profiles:

Nikola Ljubešić (author)


Cite this publication:

Tan, L., Zampieri, M., Ljubešić, N. & Tiedemann, J. (2014) Merging Comparable Data Sources for the Discrimination of Similar Languages: The DSL Corpus Collection. In: Proceedings of the 7th Workshop on Building and Using Comparable Corpora (BUCC).
@article{article, author = {Tan, Liling and Zampieri, Marcos and Ljube\v{s}i\'{c}, Nikola and Tiedemann, J\"{o}rg}, year = {2014}, pages = {20-24}, keywords = {comparable corpora, similar languages, language discrimination}, title = {Merging Comparable Data Sources for the Discrimination of Similar Languages: The DSL Corpus Collection}, keyword = {comparable corpora, similar languages, language discrimination}, publisher = {European Language Resources Association (ELRA)}, publisherplace = {Reykjav\'{\i}k, Iceland} }

Indexed in:


  • Scopus




