Bibliographic record no. 203051

Extracting most frequent Croatian root words using digram comparison and latent semantic analysis


Radoš, Zvonimir; Jović, Franjo; Job, Josip
Extracting most frequent Croatian root words using digram comparison and latent semantic analysis // Proceedings of the 7th International Conference on Enterprise Information Systems (ICEIS 2005) : proceedings
Miami (FL), United States, 2005, pp. 370-373 (lecture, international peer review, full paper (in extenso), scientific)


CROSBI ID: 203051

Title
Extracting most frequent Croatian root words using digram comparison and latent semantic analysis

Authors
Radoš, Zvonimir ; Jović, Franjo ; Job, Josip

Type, subtype, and category of work
Papers in conference proceedings, full paper (in extenso), scientific

Source
Proceedings of the 7th International Conference on Enterprise Information Systems (ICEIS 2005) : proceedings / - , 2005, 370-373

Conference
International Conference on Enterprise Information Systems (7 ; 2005)

Place and date
Miami (FL), United States, 24 May 2005 - 28 May 2005

Type of participation
Lecture

Type of review
International peer review

Keywords
morphological analysis; LSA; word tree; stem; root word; knowledge-free

Abstract
A method for extracting root words from Croatian-language text is presented. The method is knowledge-free and can be applied to any language. It draws on both morphological and semantic aspects of the language. The algorithm creates morpho-semantic groups of words and extracts a common root for every group. For morphological grouping, digram comparison is used to group words according to their morphological similarity. Latent semantic analysis is then applied to split the morphological groups into semantic subgroups of words. Root words are extracted from every morpho-semantic group. When the algorithm was applied to Croatian-language text, the hundred most frequent root words it produced included 60 grammatically correct ones and 25 FAP (for all practical purposes) correct ones.
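
The morphological part of the pipeline described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the similarity measure, the grouping procedure, or how the root is read off a group, so everything below is an assumption made for illustration: the Dice coefficient over digram sets, a greedy grouping with a hypothetical `threshold` parameter, and the longest common prefix as the extracted root. The latent semantic analysis step that splits groups into semantic subgroups is not shown.

```python
from os.path import commonprefix


def digrams(word):
    """Return the set of adjacent character pairs in a word."""
    return {word[i:i + 2] for i in range(len(word) - 1)}


def dice_similarity(a, b):
    """Dice coefficient over digram sets (assumed measure, not from the paper)."""
    da, db = digrams(a), digrams(b)
    if not da or not db:
        return 0.0
    return 2 * len(da & db) / (len(da) + len(db))


def group_words(words, threshold=0.5):
    """Greedy grouping: a word joins the first group whose first word
    is at least `threshold` similar; otherwise it starts a new group."""
    groups = []
    for word in words:
        for group in groups:
            if dice_similarity(word, group[0]) >= threshold:
                group.append(word)
                break
        else:
            groups.append([word])
    return groups


def common_root(group):
    """Approximate the root as the longest common prefix of the group."""
    return commonprefix(group)


words = ["knjiga", "knjige", "knjigu", "grad", "grada", "gradu"]
groups = group_words(words)
roots = [common_root(g) for g in groups]
print(groups)  # [['knjiga', 'knjige', 'knjigu'], ['grad', 'grada', 'gradu']]
print(roots)   # ['knjig', 'grad']
```

A prefix-based root is only a rough stand-in: it cannot handle prefixed word forms or stem alternations, which is one reason a semantic splitting step such as the one the abstract describes would be needed in practice.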

Original language
English

Scientific fields
Electrical engineering, Computing, Philology

Note
ISBN 972-8865-19-8



RELATED TO THIS WORK


Projects:
0165101 - Industrijski sustavi umjetne inteligencije (Jović, Franjo, MZOS) (CroRIS)

Institutions:
Fakultet elektrotehnike, računarstva i informacijskih tehnologija Osijek

Profiles:

Josip Job (author)

Franjo Jović (author)


Cite this publication:

Radoš, Z., Jović, F. & Job, J. (2005) Extracting most frequent Croatian root words using digram comparison and latent semantic analysis. In: Proceedings of the 7th International Conference on Enterprise Information Systems (ICEIS 2005) : proceedings.
@article{article, author = {Rado\v{s}, Zvonimir and Jovi\'{c}, Franjo and Job, Josip}, year = {2005}, pages = {370-373}, keywords = {morphological analysis, LSA, word tree, stem, root word, knowledge-free}, title = {Extracting most frequent Croatian root words using digram comparison and latent semantic analysis}, keyword = {morphological analysis, LSA, word tree, stem, root word, knowledge-free}, publisherplace = {Miami (FL), Sjedinjene Ameri\v{c}ke Dr\v{z}ave} }



