Bibliographic record no. 69540
Integral Business Intelligence System for the Croatian Language: Proper Name Recognition Module. The Fifth International Conference: Information Technology and Journalism - Journalism - The Next Step, Dubrovnik, Inter University Center, 22 - 26 May 2000 (invited lecture)
Integral Business Intelligence System for the Croatian Language: Proper Name Recognition Module. The Fifth International Conference: Information Technology and Journalism - Journalism - The Next Step, Dubrovnik, Inter University Center, 22 - 26 May 2000 (invited lecture), 1999. (other).
CROSBI ID: 69540
Title
Integral Business Intelligence System for the Croatian Language: Proper Name Recognition Module. The Fifth International Conference: Information Technology and Journalism - Journalism - The Next Step, Dubrovnik, Inter University Center, 22 - 26 May 2000 (invited lecture)
Authors
Boras, Damir; Lauc, Tomislava; Lauc, Davor.
Source
Integral Business Intelligence System for the Croatian Language: Proper Name Recognition Module.
Type, subtype
Other types of work, other
Year
1999
Keywords
names recognition; proper name; inflectional database
Abstract
The aim of the work was to build a database of all names currently in use in Croatia, in all their word forms in accordance with the rules of the Croatian language, and to set up rules for combining proper names with family names in the Croatian language. Because we could not get access to the social security database or the database of the Ministry of the Interior, we were forced to use other publicly available sources, which are much less accurate than those databases. In the future, we hope to gain access to those databases and improve our inflectional database. The database consists of 9,538 different male names (682,283 occurrences), 8,963 female names (568,703 occurrences) and 75,298 family names (1,251,106 occurrences) in all possible word forms. Until now there has been no lexical (and inflectional) database for proper names, although there are several general lexical and inflectional databases for the Croatian language.
This database can be used as an additional source for preparing a Croatian spelling checker, as a generative database for all possible forms of Croatian names, as a search aid for finding all forms of a given name, and/or as a proper name recognition module.
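The uses listed in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the table `TOY_FORMS`, the functions `build_index`, `all_forms` and `find_full_names`, and the handful of hand-written word forms are all assumptions made for illustration, and the matching rule (a given-name form directly followed by a family-name form) is a simplification of whatever combination rules the work actually defines.

```python
from collections import defaultdict

# Hypothetical toy inflection table: lemma -> (kind, distinct word forms).
# Hand-written for illustration; not taken from the authors' database.
TOY_FORMS = {
    "Ivan": ("male", ["Ivan", "Ivana", "Ivanu", "Ivane", "Ivanom"]),
    "Ana": ("female", ["Ana", "Ane", "Ani", "Anu", "Anom"]),
    "Horvat": ("family", ["Horvat", "Horvata", "Horvatu", "Horvate", "Horvatom"]),
}


def build_index(table):
    """Invert the table: word form -> set of (lemma, kind) candidates."""
    index = defaultdict(set)
    for lemma, (kind, forms) in table.items():
        for form in forms:
            index[form].add((lemma, kind))
    return index


def all_forms(table, lemma):
    """Generative use: list every stored word form of a lemma."""
    _kind, forms = table[lemma]
    return sorted(set(forms))


def find_full_names(tokens, index):
    """Recognition use: report (given lemma, family lemma) for each
    given-name form that is directly followed by a family-name form."""
    hits = []
    for first, second in zip(tokens, tokens[1:]):
        given = {l for l, k in index.get(first, ()) if k in ("male", "female")}
        family = {l for l, k in index.get(second, ()) if k == "family"}
        for g in sorted(given):
            for f in sorted(family):
                hits.append((g, f))
    return hits


INDEX = build_index(TOY_FORMS)
```

With this toy data, `find_full_names("Razgovarali smo s Ivanom Horvatom".split(), INDEX)` maps the instrumental forms back to the lemmas `Ivan` and `Horvat`, and `all_forms(TOY_FORMS, "Ana")` gives the forms a search aid would expand a query into. A real table would also have to handle ambiguous forms: "Ivana" is both an inflected form of the male name Ivan and the base form of a separate female name, which is why the index stores a set of candidates per form.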
Original language
English
Scientific fields
Information and communication sciences