Bibliographic record no. 125424

Building the Croatian National Corpus


Tadić, Marko
Building the Croatian National Corpus // Third International Conference on Language Resources and Evaluation LREC2002 / González Rodriguez, M. ; Suarez Araujo, C. P. (eds.).
Paris : Las Palmas de Gran Canaria: European Language Resources Association (ELRA), 2002. pp. 441-446


CROSBI ID: 125424

Title
Building the Croatian National Corpus

Authors
Tadić, Marko

Type, subtype and category of work
Book chapters, scientific

Book
Third International Conference on Language Resources and Evaluation LREC2002

Editor(s)
González Rodriguez, M. ; Suarez Araujo, C. P.

Publisher
European Language Resources Association (ELRA)

City
Paris : Las Palmas de Gran Canaria

Year
2002

Page range
441-446

ISBN
2-9517408-0-8

Keywords
Croatian language, Corpus building, Croatian national corpus, POS tagging

Abstract
The paper presents the work done so far on building the Croatian National Corpus (HNK), which has been collected since 1998 at the Institute of Linguistics, Faculty of Philosophy, University of Zagreb. It describes the corpus size, time span, composition, and criteria for text selection. The HNK consists of two parts: 1) a 30-million corpus of contemporary Croatian, and 2) the Croatian Electronic Textual Archive. The procedures for corpus mark-up and processing are discussed. One of the most interesting features of the corpus is that it has been available for querying through the WWW since its launch in 1998. The paper closes with future directions: enlarging the 30m corpus to 100m in the next few years, enhancing corpus management and querying, and further annotation and processing.

Original language
English

Scientific fields
Philology



RELATED TO THIS WORK


Projects:
0130418

Institutions:
Filozofski fakultet, Zagreb

Profiles:

Marko Tadić (author)

Cite this publication:

Tadić, M. (2002) Building the Croatian National Corpus. In: González Rodriguez, M. & Suarez Araujo, C. (eds.) Third International Conference on Language Resources and Evaluation LREC2002. Paris : Las Palmas de Gran Canaria, European Language Resources Association (ELRA), pp. 441-446.
@inbook{inbook, author = {Tadi\'{c}, Marko}, year = {2002}, pages = {441--446}, keywords = {Croatian language, Corpus building, Croatian national corpus, POS tagging}, isbn = {2-9517408-0-8}, title = {Building the Croatian National Corpus}, publisher = {European Language Resources Association (ELRA)}, publisherplace = {Paris : Las Palmas de Gran Canaria} }



