
Bibliographic record no.: 198152

Automated news item categorization


Bačan, Hrvoje; Gulija, Darko; Pandžić, Igor
Automated news item categorization // Proceedings of JSAI 2005 Workshop on Conversational Informatics, in conjunction with the 19th Annual Conference of The Japanese Society for Artificial Intelligence JSAI 2005 / Sumi, Yasuyuki ; Nishida, Toyoaki (eds.).
Kitakyushu: Kyoto University, 2005. pp. 57-62 (lecture, international peer review, full paper (in extenso), scientific)


CROSBI ID: 198152

Title
Automated news item categorization

Authors
Bačan, Hrvoje ; Gulija, Darko ; Pandžić, Igor

Type, subtype and category of work
Papers in conference proceedings, full paper (in extenso), scientific

Source
Proceedings of JSAI 2005 Workshop on Conversational Informatics, in conjunction with the 19th Annual Conference of The Japanese Society for Artificial Intelligence JSAI 2005 / Sumi, Yasuyuki ; Nishida, Toyoaki - Kitakyushu : Kyoto University, 2005, 57-62

Conference
JSAI 2005 Workshop on Conversational Informatics, in conjunction with the 19th Annual Conference of The Japanese Society for Artificial Intelligence JSAI 2005

Place and date
Kitakyushu, Japan, 13-14 June 2005

Type of participation
Lecture

Type of review
International peer review

Keywords
text categorization; machine learning; news categorization; IPTC

Abstract
We present a system for automatic categorization of news items into a standard set of categories. The system has been built specifically for news stories written in the Croatian language. It uses the standard set of news categories established by the International Press Telecommunications Council (IPTC). The categorization algorithm transforms each document into a vector of weights corresponding to an automatically chosen set of keywords. This process is performed on a large training set of news items, forming a multi-dimensional space populated by news items of known categories. An unknown news item is likewise transformed into a vector of keyword weights and then categorized using the k-NN method in this space. The system has been trained on a collection of approx. 2700 manually categorized news items provided by the Croatian News Agency and tested on a different set of approx. 500 randomly chosen news items from the same source. The automatic categorization gave a correct result for 85% of the tested news items.
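The pipeline described in the abstract (keyword-weight vectors plus k-NN) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how keywords are chosen, how weights are computed, which distance is used, or what value of k was chosen. The sketch assumes TF-IDF weights, keyword selection by document frequency, cosine similarity, and k = 3. The function names, the toy English sentences, and the two category labels are hypothetical stand-ins for the Croatian news items and IPTC categories.

```python
import math
from collections import Counter

def tokenize(text):
    """Lowercase, split on whitespace, strip surrounding punctuation."""
    tokens = (word.strip(".,;:!?()\"'") for word in text.lower().split())
    return [t for t in tokens if t]

def vectorize(text, model):
    """Map a text to a vector of keyword weights (assumed: TF-IDF)."""
    counts = Counter(tokenize(text))
    return [counts[t] * model["idf"][t] for t in model["keywords"]]

def train(items, max_keywords=50):
    """Build the vector space from (text, category) pairs.

    Assumed selection rule: keep the terms with the highest document
    frequency, skipping terms that occur in every document.
    """
    texts = [text for text, _ in items]
    doc_freq = Counter()
    for text in texts:
        doc_freq.update(set(tokenize(text)))
    candidates = sorted((t for t, df in doc_freq.items() if df < len(texts)),
                        key=lambda t: (-doc_freq[t], t))
    keywords = candidates[:max_keywords]
    idf = {t: math.log(len(texts) / doc_freq[t]) for t in keywords}
    model = {"keywords": keywords, "idf": idf}
    model["vectors"] = [vectorize(text, model) for text in texts]
    model["labels"] = [label for _, label in items]
    return model

def cosine(u, v):
    """Cosine similarity; 0.0 when either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def knn_classify(text, model, k=3):
    """Return the majority category among the k most similar training items."""
    query = vectorize(text, model)
    ranked = sorted(zip(model["vectors"], model["labels"]),
                    key=lambda pair: cosine(query, pair[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

def accuracy(model, test_items, k=3):
    """Fraction of (text, category) test pairs that are classified correctly."""
    hits = sum(knn_classify(text, model, k) == label for text, label in test_items)
    return hits / len(test_items)

# Hypothetical toy data; the real system used news from the Croatian News Agency.
TRAIN_ITEMS = [
    ("The football team won the league match", "sport"),
    ("The striker scored a goal in the match", "sport"),
    ("The coach praised the team after the goal", "sport"),
    ("The central bank raised interest rates", "economy"),
    ("Inflation and interest rates worry the market", "economy"),
    ("The stock market fell as the bank reported losses", "economy"),
]
TEST_ITEMS = [
    ("The team scored a late goal", "sport"),
    ("Interest rates hit the stock market", "economy"),
]
MODEL = train(TRAIN_ITEMS)

if __name__ == "__main__":
    for text, _ in TEST_ITEMS:
        print(text, "->", knn_classify(text, MODEL))
    print("accuracy:", accuracy(MODEL, TEST_ITEMS))
```

The `accuracy` helper corresponds to the kind of held-out evaluation the abstract reports (about 500 test items, 85% correct). The toy data here is far too small to say anything about the real system's performance.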

Original language
English

Scientific fields
Electrical engineering, Computing, Information and communication sciences



RELATED RECORDS


Projects:
0036060

Institutions:
Faculty of Electrical Engineering and Computing, Zagreb

Profiles:

Igor Sunday Pandžić (author)


Cite this publication:

Bačan, Hrvoje; Gulija, Darko; Pandžić, Igor
Automated news item categorization // Proceedings of JSAI 2005 Workshop on Conversational Informatics, in conjunction with the 19th Annual Conference of The Japanese Society for Artificial Intelligence JSAI 2005 / Sumi, Yasuyuki ; Nishida, Toyoaki (eds.).
Kitakyushu: Kyoto University, 2005. pp. 57-62 (lecture, international peer review, full paper (in extenso), scientific)
Bačan, H., Gulija, D. & Pandžić, I. (2005) Automated news item categorization. In: Sumi, Y. & Nishida, T. (eds.) Proceedings of JSAI 2005 Workshop on Conversational Informatics, in conjunction with the 19th Annual Conference of The Japanese Society for Artificial Intelligence JSAI 2005.
@article{article,
  author = {Ba\v{c}an, Hrvoje and Gulija, Darko and Pand\v{z}i\'{c}, Igor},
  title = {Automated news item categorization},
  year = {2005},
  pages = {57--62},
  keywords = {text categorization, machine learning, news categorization, IPTC},
  publisher = {Kyoto University},
  publisherplace = {Kitakj\={u}sh\={u}, Japan}
}



