Bibliographic record number: 1253068

Multi-label approaches to web genre identification


Vidulin, Vedrana; Luštrek, Mitja; Gams, Matjaž
Multi-label approaches to web genre identification // Journal for language technology and computational linguistics, 24 (2009), 1; 93-110 (international peer review, article, scientific)


CROSBI ID: 1253068

Title
Multi-label approaches to web genre identification

Authors
Vidulin, Vedrana ; Luštrek, Mitja ; Gams, Matjaž

Source
Journal for language technology and computational linguistics (0175-1336) 24 (2009), 1; 93-110

Type, subtype and category of work
Journal papers, article, scientific

Keywords
web genre classification, multi-label classification

Abstract
A web page is a complex document which can share conventions of several genres, or contain several parts, each belonging to a different genre. To properly address the genre interplay, a recent proposal in automatic web genre identification is multi-label classification. The dominant approach to such classification is to transform one multi-label machine learning problem into several sub-problems of learning binary single-label classifiers, one for each genre. In this paper we explore multi-class transformation, where each combination of genres is labeled with a single distinct label. This approach is then compared to the binary approach to determine which one better captures the multi-label aspect of web genres. Experimental results show that both of the approaches failed to properly address multi-genre web pages. Obtained differences were a result of the variations in the recognition of one-genre web pages.
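
The two problem transformations contrasted in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the page names, genre labels, and function names (`binary_transform`, `multiclass_transform`) are hypothetical, chosen only to show how one multi-label dataset becomes either several binary datasets or one multi-class dataset.

```python
# Toy illustration of the two transformations named in the abstract.
# Page names and genre labels are hypothetical, not data from the paper.

def binary_transform(samples, genres):
    """Build one binary single-label dataset per genre.

    Each page is labeled True if it carries that genre, else False.
    """
    return {
        genre: [(page, genre in labels) for page, labels in samples]
        for genre in genres
    }


def multiclass_transform(samples):
    """Build one multi-class dataset.

    Each distinct combination of genres is mapped to a single label.
    """
    return [(page, "+".join(sorted(labels))) for page, labels in samples]


samples = [
    ("page_a", {"blog"}),          # one-genre page
    ("page_b", {"shop", "faq"}),   # multi-genre page
    ("page_c", {"blog", "faq"}),   # multi-genre page
]
genres = ["blog", "faq", "shop"]

binary = binary_transform(samples, genres)
multiclass = multiclass_transform(samples)

print(binary["faq"])
print(multiclass)
```

In the binary form a multi-genre page appears as a positive example in several datasets, while in the multi-class form it receives one combined label, so the number of classes grows with the number of distinct genre combinations seen in the data.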

Original language
English



WORK ASSOCIATIONS


Profiles:

Vedrana Vidulin (author)

Links to the full text of the work:

jlcl.org

Cite this publication:

Vidulin, Vedrana; Luštrek, Mitja; Gams, Matjaž
Multi-label approaches to web genre identification // Journal for language technology and computational linguistics, 24 (2009), 1; 93-110 (international peer review, article, scientific)
Vidulin, V., Luštrek, M. & Gams, M. (2009) Multi-label approaches to web genre identification. Journal for language technology and computational linguistics, 24 (1), 93-110.
@article{article, author = {Vidulin, Vedrana and Lu\v{s}trek, Mitja and Gams, Matja\v{z}}, year = {2009}, pages = {93-110}, keywords = {web genre classification, multi-label classification}, journal = {Journal for language technology and computational linguistics}, volume = {24}, number = {1}, issn = {0175-1336}, title = {Multi-label approaches to web genre identification}, keyword = {web genre classification, multi-label classification} }



