Bibliographic record number: 1103795

ReSiPC: a Tool for Complex Searches in Parallel Corpora


Oliver, Antoni; Mikelenić, Bojana
ReSiPC: a Tool for Complex Searches in Parallel Corpora // Proceedings of The 12th Language Resources and Evaluation Conference / Calzolari, Nicoletta ; Béchet, Frédéric ; Blache, Philippe ; Choukri, Khalid ; Cieri, Christopher ; Declerck, Thierry ; Goggi, Sara ; Isahara, Hitoshi ; Maegaard, Bente ; Mariani, Joseph ; Mazo, Hélène ; Moreno, Asuncion ; Odijk, Jan ; Piperidis, Stelios (eds.).
Marseille: European Language Resources Association (ELRA), 2020. pp. 7033-7037 (poster, international peer review, full paper (in extenso), other)


CROSBI ID: 1103795

Title
ReSiPC: a Tool for Complex Searches in Parallel Corpora

Authors
Oliver, Antoni ; Mikelenić, Bojana

Type, subtype and category of work
Papers in conference proceedings, full paper (in extenso), other

Source
Proceedings of The 12th Language Resources and Evaluation Conference / Calzolari, Nicoletta ; Béchet, Frédéric ; Blache, Philippe ; Choukri, Khalid ; Cieri, Christopher ; Declerck, Thierry ; Goggi, Sara ; Isahara, Hitoshi ; Maegaard, Bente ; Mariani, Joseph ; Mazo, Hélène ; Moreno, Asuncion ; Odijk, Jan ; Piperidis, Stelios - Marseille : European Language Resources Association (ELRA), 2020, 7033-7037

ISBN
979-10-95546-34-4

Conference
The 12th Language Resources and Evaluation Conference (LREC2020)

Place and date
Marseille, France, 11 May 2020 - 16 May 2020

Type of participation
Poster

Type of review
International peer review

Keywords
parallel corpora ; regular expressions ; contrastive linguistics

Abstract
In this paper, a tool specifically designed to allow for complex searches in large parallel corpora is presented. The formalism for the queries is very powerful as it uses standard regular expressions that allow for complex queries combining word forms, lemmata and POS-tags. As queries are performed over POS-tags, at least one of the languages in the parallel corpus should be POS-tagged. Searches can be performed in one of the languages or in both languages at the same time. The program is able to POS-tag the corpora using the Freeling analyzer through its Python API. ReSiPC is developed in Python version 3 and it is distributed under a free license (GNU GPL). The tool can be used to provide data for contrastive linguistics research and an example of use in a Spanish-Croatian parallel corpus is presented. ReSiPC is designed for queries in POS-tagged corpora, but it can be easily adapted for querying corpora containing other kinds of information.
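The kind of query the abstract describes can be illustrated with a minimal Python sketch. It is not ReSiPC's code or file format: the `form|lemma|TAG` token encoding, the simplified tags (`VA`, `VM`, `RG`), the sample sentences, and the helper names `token` and `search` are all assumptions made for illustration. The sketch only shows how a standard regular expression can combine word forms, lemmata and POS-tags, and how a search can be restricted to one side of a sentence pair or applied to both.

```python
import re

# Hypothetical encoding (not ReSiPC's actual format): each token is
# "form|lemma|TAG", tokens are space-separated, and each corpus entry
# pairs a tagged Spanish sentence with its tagged Croatian translation.
corpus = [
    ("he|haber|VA cantado|cantar|VM", "pjevao|pjevati|VM sam|biti|VA"),
    ("canto|cantar|VM bien|bien|RG", "pjevam|pjevati|VM dobro|dobro|RG"),
]

ANY = r"[^|\s]+"  # matches any single field of a token


def token(form=ANY, lemma=ANY, tag=ANY):
    """Build a regex fragment for one whole token; omitted fields match anything."""
    return rf"(?<!\S){form}\|{lemma}\|{tag}(?!\S)"


def search(corpus, src_query=None, tgt_query=None):
    """Return the sentence pairs in which every supplied query matches its side."""
    hits = []
    for src, tgt in corpus:
        if src_query and not re.search(src_query, src):
            continue
        if tgt_query and not re.search(tgt_query, tgt):
            continue
        hits.append((src, tgt))
    return hits


# Spanish: auxiliary with lemma "haber" immediately followed by a main verb.
perfect = token(lemma="haber", tag="VA") + r"\s" + token(tag="VM")
# Croatian: auxiliary with lemma "biti".
aux = token(lemma="biti", tag="VA")

one_side = search(corpus, src_query=perfect)                    # one language
both_sides = search(corpus, src_query=perfect, tgt_query=aux)   # both languages
```

Building the pattern from per-token fragments keeps each query readable while still leaving the full regular-expression syntax available, for example alternation over tags or optional tokens between two anchors.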

Original language
English

Scientific fields
Information and communication sciences, Philology

Note
Due to the coronavirus pandemic, the conference was not held, but the proceedings were published on 2020-05-15.



PAPER AFFILIATIONS


Institutions:
Filozofski fakultet, Zagreb

Profiles:

Bojana Mikelenić (author)

Links to the full text of the paper:

www.lrec-conf.org

Cite this publication:

Oliver, Antoni; Mikelenić, Bojana
ReSiPC: a Tool for Complex Searches in Parallel Corpora // Proceedings of The 12th Language Resources and Evaluation Conference / Calzolari, Nicoletta ; Béchet, Frédéric ; Blache, Philippe ; Choukri, Khalid ; Cieri, Christopher ; Declerck, Thierry ; Goggi, Sara ; Isahara, Hitoshi ; Maegaard, Bente ; Mariani, Joseph ; Mazo, Hélène ; Moreno, Asuncion ; Odijk, Jan ; Piperidis, Stelios (eds.).
Marseille: European Language Resources Association (ELRA), 2020. pp. 7033-7037 (poster, international peer review, full paper (in extenso), other)
Oliver, A. & Mikelenić, B. (2020) ReSiPC: a Tool for Complex Searches in Parallel Corpora. In: Calzolari, N., Béchet, F., Blache, P., Choukri, K., Cieri, C., Declerck, T., Goggi, S., Isahara, H., Maegaard, B., Mariani, J., Mazo, H., Moreno, A., Odijk, J. & Piperidis, S. (eds.) Proceedings of The 12th Language Resources and Evaluation Conference.
@inproceedings{article,
  author    = {Oliver, Antoni and Mikeleni\'{c}, Bojana},
  title     = {ReSiPC: a Tool for Complex Searches in Parallel Corpora},
  booktitle = {Proceedings of The 12th Language Resources and Evaluation Conference},
  year      = {2020},
  pages     = {7033--7037},
  isbn      = {979-10-95546-34-4},
  keywords  = {parallel corpora, regular expressions, contrastive linguistics},
  publisher = {European Language Resources Association (ELRA)},
  address   = {Marseille, France}
}



