Bibliographic record no.: 1253649
Towards a Reference Corpus of Web Genres for the Evaluation of Genre Identification Systems
Towards a Reference Corpus of Web Genres for the Evaluation of Genre Identification Systems // Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08) / Calzolari, Nicoletta ; Choukri, Khalid ; Maegaard, Bente ; Mariani, Joseph ; Odijk, Jan ; Piperidis, Stelios ; Tapias, Daniel (eds.).
Marrakech, Morocco, 2008, pp. 351-358 (lecture, international peer review, full paper (in extenso), scientific)
CROSBI ID: 1253649
Title
Towards a Reference Corpus of Web Genres for the Evaluation of Genre Identification Systems
Authors
Rehm, Georg ; Santini, Marina ; Mehler, Alexander ; Braslavski, Pavel ; Gleim, Rüdiger ; Stubbe, Andrea ; Symonenko, Svetlana ; Tavosanis, Mirko ; Vidulin, Vedrana
Type, subtype and category of work
Papers in conference proceedings, full paper (in extenso), scientific
Source
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)
/ Calzolari, Nicoletta ; Choukri, Khalid ; Maegaard, Bente ; Mariani, Joseph ; Odijk, Jan ; Piperidis, Stelios ; Tapias, Daniel - , 2008, 351-358
ISBN
2-9517408-4-0
Conference
International Conference on Language Resources and Evaluation
Place and date
Marrakech, Morocco, 28-30 May 2008
Type of participation
Lecture
Type of review
International peer review
Keywords
web genres, reference corpus
Abstract
We present initial results from an international and multi-disciplinary research collaboration that aims at the construction of a reference corpus of web genres. The primary application scenario for which we plan to build this resource is the automatic identification of web genres. Web genres are rather difficult to capture and to describe in their entirety, but we plan for the finished reference corpus to contain multi-level tags of the respective genre or genres a web document or a website instantiates. As the construction of such a corpus is by no means a trivial task, we discuss several alternatives that are, for the time being, mostly based on existing collections. Furthermore, we discuss a shared set of genre categories and a multi-purpose tool as two additional prerequisites for a reference corpus of web genres.
Original language
English