Bibliographic record number: 581314

Corpus Analysis with NooJ


Vučković, Kristina; Silberztein, Max; Varadi, Tamas
Corpus Analysis with NooJ, 2012. (workshop).


Title
Corpus Analysis with NooJ

Authors
Vučković, Kristina ; Silberztein, Max ; Varadi, Tamas

Source
http://www.lrec-conf.org/lrec2012/?LREC-2012-Tutorial-Material

Type, subtype
Other types of work, workshop

Year
2012

Keywords
Corpus processing ; linguistic units ; queries ; annotations ; morphology ; syntax

Abstract
NooJ is a freeware language-engineering development environment used to formalize and integrate nine levels of linguistic phenomena: orthography and typography, lexical, inflectional and derivational morphology, local, structural and transformational syntax, semantics. For each of these levels, NooJ provides linguists with one or more formal frameworks specifically designed to facilitate the description of each phenomenon, as well as parsing, development and debugging tools designed to be as computationally efficient as possible, from Finite-State to Turing machines. This approach distinguishes NooJ from other computational linguistic frameworks, which provide a single formalism that is supposed to cover all linguistic phenomena. As an engineering development environment, NooJ contains tools to help construct, test, debug, maintain and accumulate large sets of linguistic resources, as well as tools to process large texts and corpora. The system has been in development since 2002 and has been used to build over 20 language modules. As a corpus processing tool, NooJ allows researchers in various social sciences to extract information from any text or corpus (i.e. untagged) by applying sophisticated queries based on concepts rather than word forms, and to build indices and concordances, annotate texts automatically, perform statistical analyses on concepts, etc. NooJ is freely available and runs on Windows, Linux, Solaris and Mac OS X; linguistic modules can already be freely downloaded for over a dozen languages. See www.nooj4nlp.net for more information on NooJ; the page “doc & help” provides references to NooJ-related publications. This workshop intends to help participants master three basic NooJ functionalities: corpus processing; formalization of linguistic units; and syntactic parsing and automatic annotation of texts.

Original language
English

Scientific fields
Information and communication sciences



WORK AFFILIATIONS


Project / topic
130-1300646-1776 - Računalna sintaksa hrvatskoga jezika (Zdravko Dovedan Han)

Institutions
Filozofski fakultet, Zagreb

Author with registration number:
Kristina Kocijan (256436)