Bibliographic record overview, number: 997715
An Online Syntactic and Semantic Framework for Lexical Relations Extraction Using Natural Language Deterministic Model
An Online Syntactic and Semantic Framework for Lexical Relations Extraction Using Natural Language Deterministic Model, 2019, doctoral dissertation, Fakultet organizacije i informatike, Zagreb doi:10.13140/RG.2.2.31092.19849
CROSBI ID: 997715
Title
An Online Syntactic and Semantic Framework for Lexical Relations Extraction Using Natural Language Deterministic Model
Authors
Orešković, Marko
Type, subtype, and category of work
Theses, doctoral dissertation
Faculty
Fakultet organizacije i informatike
Place
Zagreb
Date
15.03
Year
2019
Pages
237
Mentor
Čubrilo, Mirko ; Essert, Mario
Keywords
syntax analysis, semantic analysis, lexical relations extraction, new lexicon types, hierarchical tagset structure, linked open data
Abstract
Given the extraordinary growth in online documents, methods for automated extraction of semantic relations became popular, and shortly after, became necessary. This thesis proposes a new deterministic language model, with the associated artifact, which acts as an online Syntactic and Semantic Framework (SSF) for the extraction of morphosyntactic and semantic relations. The model covers all fundamental linguistic fields: Morphology (formation, composition, and word paradigms), Lexicography (storing words and their features in network lexicons), Syntax (the composition of words in meaningful parts: phrases, sentences, and pragmatics), and Semantics (determining the meaning of phrases). To achieve this, a new tagging system with more complex structures was developed. Instead of the commonly used vectored systems, this new tagging system uses tree-like T-structures with hierarchical, grammatical Word of Speech (WOS), and Semantic of Word (SOW) tags. For relations extraction, it was necessary to develop a syntactic (sub)model of language, which ultimately is the foundation for performing semantic analysis. This was achieved by introducing a new 'O-structure', which represents the union of WOS/SOW features from T-structures of words and enables the creation of syntagmatic patterns. Such patterns are a powerful mechanism for the extraction of conceptual structures (e.g., metonymies, similes, or metaphors), breaking sentences into main and subordinate clauses, or detection of a sentence’s main construction parts (subject, predicate, and object). Since all program modules are developed as general and generative entities, SSF can be used for any of the Indo-European languages, although validation and network lexicons have been developed for the Croatian language only.
The SSF has three types of lexicons (morphs/syllables, words, and multi-word expressions), and the main words lexicon is included in the Global Linguistic Linked Open Data (LLOD) Cloud, allowing interoperability with all other world languages. The SSF model and its artifact represent a complete natural language model which can be used to extract the lexical relations from single sentences, paragraphs, and also from large collections of documents.
Original language
English
Scientific fields
Information and communication sciences
WORK AFFILIATIONS
Institutions:
Fakultet organizacije i informatike, Varaždin