Bibliographic record number: 1013026
Standardization and unification of content generation and management in knowledge base applications
Standardization and unification of content generation and management in knowledge base applications, 2019, master's thesis, graduate level, Odjel za informatiku (Department of Informatics), Rijeka
CROSBI ID: 1013026
Title
Standardization and unification of content generation and management in knowledge base applications
Authors
Barić, Ivana
Type, subtype, and category of work
Theses, master's thesis, graduate level
Faculty
Odjel za informatiku (Department of Informatics)
Place
Rijeka
Date
12.07
Year
2019
Pages
81
Mentor
Martinčić-Ipšić, Sanda
Keywords
standardization, unification, knowledge management, knowledge base, language processing, NLP, keyword extraction, machine learning, Gephi, Weka, MicroStrategy, NetworkX, decision trees, random forest.
Abstract
This master's thesis describes the process of standardizing and unifying content generation and management in knowledge base applications. The topic includes natural language processing methods that help the user acquire new knowledge based on already familiar knowledge. The unification of new knowledge was also studied through transformation, mapping, deduplication, and export, to make written knowledge or data easier to search and analyze. The introduction describes the domain in which natural language processing, machine learning, and various related procedures and techniques are used. The problem description gives a brief overview of the specific problem to be solved using these procedures and the corresponding software tools. Procedures for extracting information from data are described, using appropriate tools and natural language processing methods (for example, in Python) as well as methods used in decision support systems and machine learning. Tools used to visualize the results, such as MicroStrategy, Gephi, and Weka, are also described. Based on the information obtained, possible solutions to the problem are presented. The data used in the thesis is taken from the knowledge base of a real business organization, but because it is confidential and its content is a strict business secret, the data has been modified accordingly. The first phase presents the part of the solution that focuses on analyzing and monitoring data quality and on classifying the basic elements into their corresponding classes. The second phase focuses on establishing data quality standards for data entered through the application. The third and final phase uses keyword extraction methods to visually present the elements most commonly used in describing a specific device.
Original language
English
Scientific fields
Computer science, Information and communication sciences
WORK AFFILIATIONS
Institutions:
Fakultet informatike i digitalnih tehnologija (Faculty of Informatics and Digital Technologies), Rijeka
Profiles:
Sanda Martinčić - Ipšić
(mentor)