
Bibliographic record number: 69540

Integral Business Intelligence System for the Croatian Language: Proper Name Recognition Module. The Fifth International Conference: Information Technology and Journalism - Journalism - The Next Step, Dubrovnik, Inter University Center, 22 - 26 May 2000 (invited lecture)


Boras, Damir; Lauc, Tomislava; Lauc, Davor.
Integral Business Intelligence System for the Croatian Language: Proper Name Recognition Module. The Fifth International Conference: Information Technology and Journalism - Journalism - The Next Step, Dubrovnik, Inter University Center, 22 - 26 May 2000 (invited lecture), 1999. (invited lecture).


Title
Integral Business Intelligence System for the Croatian Language: Proper Name Recognition Module. The Fifth International Conference: Information Technology and Journalism - Journalism - The Next Step, Dubrovnik, Inter University Center, 22 - 26 May 2000 (invited lecture)

Authors
Boras, Damir ; Lauc, Tomislava ; Lauc, Davor.

Source
Integral Business Intelligence System for the Croatian Language: Proper Name Recognition Module.

Type, subtype
Other types of work, invited lecture

Year
1999

Keywords
Name recognition; proper name; inflectional database

Abstract
The aim of the work was to build a database of all names currently in use in Croatia, in all their word forms according to the rules of the Croatian language, and to set up rules for combining given names with family names in Croatian. Because we could not get access to the social security database or the database of the Ministry of the Interior, we were forced to use other publicly available sources, which are much less accurate than those databases. In the future, we hope to get access to those databases and improve our inflectional database. The database consists of 9,538 different male names (682,283 occurrences), 8,963 female names (568,703 occurrences) and 75,298 family names (1,251,106 occurrences) in all possible word forms. Until now, there has been no lexical (and inflectional) database for proper names, although several general lexical and inflectional databases exist for the Croatian language. This database can be used as an additional source for preparing a Croatian spelling checker, as a generative database of all possible forms of Croatian names, as a search aid for finding all forms of a given name, and/or as a proper name recognition module.
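The uses listed in the abstract (generating all forms of a name, searching for any form, and recognizing names in text) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The three-name lexicon is hand-written toy data, not taken from the database, and the function names `build_index`, `all_forms` and `find_full_names` are hypothetical. The rule that a given-name form directly followed by a family-name form makes a full name is a simplifying assumption standing in for the combination rules the abstract mentions.

```python
import re
from collections import defaultdict

# Toy data, hand-written for illustration only; NOT from the authors' database.
# Each base (nominative) form maps to a few of its inflected word forms.
GIVEN_NAMES = {
    "Ivan": ["Ivan", "Ivana", "Ivanu", "Ivane", "Ivanom"],
    "Ana": ["Ana", "Ane", "Ani", "Anu", "Anom"],
}
FAMILY_NAMES = {
    "Horvat": ["Horvat", "Horvata", "Horvatu", "Horvate", "Horvatom"],
}


def build_index(lexicon):
    """Invert a base-form -> word-forms lexicon into word-form -> base forms."""
    index = defaultdict(set)
    for base, forms in lexicon.items():
        for form in forms:
            index[form].add(base)
    return index


GIVEN_INDEX = build_index(GIVEN_NAMES)
FAMILY_INDEX = build_index(FAMILY_NAMES)


def all_forms(base):
    """Generative use: list every stored word form of a base name."""
    return GIVEN_NAMES.get(base) or FAMILY_NAMES.get(base) or []


def find_full_names(text):
    """Recognition use: find adjacent (given name, family name) word forms.

    Assumption: a full name is a given-name form immediately followed by a
    family-name form. Returns (surface form, base form) pairs.
    """
    tokens = re.findall(r"\w+", text)
    hits = []
    for first, second in zip(tokens, tokens[1:]):
        if first in GIVEN_INDEX and second in FAMILY_INDEX:
            for given in sorted(GIVEN_INDEX[first]):
                for family in sorted(FAMILY_INDEX[second]):
                    hits.append((f"{first} {second}", f"{given} {family}"))
    return hits
```

For example, `find_full_names("Razgovarali smo s Ivanom Horvatom.")` maps the instrumental form "Ivanom Horvatom" back to the base form "Ivan Horvat". A real module would also need to handle ambiguous forms, such as "Ivana" being both a case form of "Ivan" and a female name in its own right.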

Original language
English

Scientific fields
Information and communication sciences



WORK AFFILIATIONS


Project / topic
130743

Institutions
Filozofski fakultet, Zagreb

Authors with registry numbers:
Tomislava Lauc, (200814)
Damir Boras, (4513)