
Bibliographic record no. 775581

Using machine learning for language and structure annotation in an 18th century dictionary


Bago, Petra; Ljubešić, Nikola
Using machine learning for language and structure annotation in an 18th century dictionary // Proceedings of the Electronic lexicography in the 21st century 2015 conference / Kosem, Iztok ; Jakubíček, Miloš ; Kallas, Jelena ; Krek, Simon (eds.).
Ljubljana : Brighton: Trojina, Institute for Applied Slovene Studies/Lexical Computing Ltd., 2015. pp. 427-442 (lecture, international peer review, full paper (in extenso), scientific)


CROSBI ID: 775581

Title
Using machine learning for language and structure annotation in an 18th century dictionary

Authors
Bago, Petra ; Ljubešić, Nikola

Type, subtype and category of work
Papers in conference proceedings, full paper (in extenso), scientific

Source
Proceedings of the Electronic lexicography in the 21st century 2015 conference / Kosem, Iztok ; Jakubíček, Miloš ; Kallas, Jelena ; Krek, Simon - Ljubljana : Brighton : Trojina, Institute for Applied Slovene Studies/Lexical Computing Ltd., 2015, 427-442

ISBN
978-961-93594-3-3

Conference
Electronic lexicography in the 21st century: linking lexical data in the digital age

Place and date
Hailsham, United Kingdom, 11-13 August 2015

Type of participation
Lecture

Type of review
International peer review

Keywords
historical dictionaries; language annotation; structure annotation; supervised machine learning

Abstract
The accessibility of digitized historical texts is increasing, which has resulted in a growing interest in applying machine learning methods to enrich this type of content. The need for machine learning is even greater than for modern texts, given the high level of inconsistency in historical texts, even within a single document. In this paper we investigate the application of a supervised structural machine learning method to language and structure annotation of 18th century dictionary entries. Our research is conducted on the first volume of the trilingual dictionary ‘Dizionario italiano–latino–illirico’ (Italian–Latin–Croatian Dictionary), compiled by Ardellio della Bella and printed in Dubrovnik in 1785. We assume that this method can significantly reduce the time needed for manual annotation and simplify the process for annotators. We reach an accuracy of approximately 98% for language annotation and around 96% for structure annotation. A final experiment on the time gain obtained by pre-annotating the data shows that correcting the generated labels alone is roughly five times faster than full manual annotation.
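
As a rough illustration of the two figures reported in the abstract, the sketch below computes token-level accuracy between gold and predicted label sequences, and a speed-up ratio between full manual annotation and correction of pre-annotated labels. It is not the authors' evaluation code: the function names, the label codes ("it", "la", "hr") and the toy data are all assumptions made for this example, and the abstract does not state how accuracy was counted.

```python
from typing import Sequence


def token_accuracy(gold: Sequence[Sequence[str]],
                   predicted: Sequence[Sequence[str]]) -> float:
    """Return the share of tokens whose predicted label matches the gold label."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must contain the same number of entries")
    total = 0
    correct = 0
    for gold_entry, pred_entry in zip(gold, predicted):
        if len(gold_entry) != len(pred_entry):
            raise ValueError("each entry needs exactly one predicted label per token")
        total += len(gold_entry)
        correct += sum(g == p for g, p in zip(gold_entry, pred_entry))
    if total == 0:
        raise ValueError("no tokens to evaluate")
    return correct / total


def speedup(manual_seconds: float, correction_seconds: float) -> float:
    """Return how many times faster correction is than full manual annotation."""
    if correction_seconds <= 0:
        raise ValueError("correction time must be positive")
    return manual_seconds / correction_seconds


# Hypothetical toy data: two dictionary entries, one language label per token.
# These label codes are illustrative only, not the tag set used in the paper.
gold = [["it", "it", "la", "hr"], ["it", "la", "la", "hr", "hr", "hr"]]
pred = [["it", "it", "la", "hr"], ["it", "la", "hr", "hr", "hr", "hr"]]

language_accuracy = token_accuracy(gold, pred)   # 9 of 10 tokens match
time_gain = speedup(manual_seconds=500.0, correction_seconds=100.0)
print(f"accuracy = {language_accuracy:.2f}, speed-up = {time_gain:.1f}x")
```

The same accuracy function could be applied to structure labels by swapping in a different label set; the timings passed to `speedup` are placeholders, not measurements from the paper.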

Original language
English

Scientific fields
Information and communication sciences



RELATED PROJECTS, INSTITUTIONS AND PROFILES


Projects:
130-1301679-1380 - Hrvatska rječnička baština i hrvatski europski identitet (Boras, Damir, MZOS)

Institutions:
Faculty of Humanities and Social Sciences, Zagreb

Profiles:

Petra Bago (author)

Nikola Ljubešić (author)

Links to the full text of the paper:

Full-text access: elex.link




