Using machine learning for anomaly detection in streaming data (CROSBI ID 699099)
Conference paper in proceedings | original scientific paper
Statement of responsibility
Majić, Stefani ; Zekić Sušac, Marijana ; Has, Adela
English
Using machine learning for anomaly detection in streaming data
The amount and availability of streaming data have recently increased enormously, which brings new technological challenges and opportunities. Streaming data is a real-time, continuous sequence of items ordered implicitly by arrival time or explicitly by timestamp. The relevance of this paper lies in the ever-increasing amount of available data resulting from the growing use of Big Data and the Internet of Things. The aim of this paper is to explore the use of machine learning algorithms for detecting anomalies in streaming data. Early anomaly detection is valuable but hard to perform reliably in practice. The paper presents an overview of previous research in this area, of algorithms, tools, and methods, and of the problems of deploying and implementing machine learning algorithms on streaming data. In the empirical part of the paper, support vector machines and principal component analysis (PCA) were applied to identify anomalies in streaming data. The research was conducted on two datasets to cover two of the biggest application areas for anomaly detection: computer security and IoT sensors (HVAC). The importance of this research lies in its implications for industry. Detecting threats within network traffic, as well as detecting anomalies in sensor readings, would provide many useful features in the logistics, marketing, advertising, and game development industries. One advantage of support vector machines is that they can be applied to data with a low proportion of anomalies, since in real life no system would allow a large number of abnormal data points because of security concerns and high costs. The created model shows an accuracy of 96.4% on validation data. Although this area has been explored for quite some time, there is still plenty of room for development given the exponential increase in data volume. Therefore, it can be assumed that this topic will continue to be discussed.
Keywords: machine learning, anomaly detection, streaming data, support vector machines, PCA
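The abstract names PCA as one of the techniques applied but gives no implementation details, so the following is only a minimal sketch of one common way PCA can score anomalies: project each reading onto the leading principal component and flag readings whose reconstruction error is unusually large. It is not the authors' model. The data are synthetic (a made-up temperature and power relationship standing in for HVAC sensor readings), the threshold rule (1.5 times the largest training error) is an arbitrary assumption, and all function names are hypothetical. The support vector machine part of the study is not shown.

```python
import math
import random


def fit_leading_component(rows, iterations=50):
    """Estimate the mean and first principal direction by power iteration."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return mean, v


def reconstruction_error(x, mean, direction):
    """Distance between x and its projection onto the leading component."""
    centered = [xi - mi for xi, mi in zip(x, mean)]
    score = sum(c * v for c, v in zip(centered, direction))
    residual = [c - score * v for c, v in zip(centered, direction)]
    return math.sqrt(sum(r * r for r in residual))


def is_anomaly(x, mean, direction, threshold):
    return reconstruction_error(x, mean, direction) > threshold


# Synthetic "normal" readings (assumption): power is roughly 2 x temperature.
random.seed(0)
training = []
for _ in range(200):
    temperature = random.gauss(21.0, 2.0)
    power = 2.0 * temperature + random.gauss(0.0, 0.3)
    training.append((temperature, power))

mean, direction = fit_leading_component(training)
# Assumed threshold rule: 1.5 x the largest error seen on normal data.
threshold = 1.5 * max(reconstruction_error(r, mean, direction) for r in training)

# Score readings one at a time, as they would arrive from a stream.
stream = [(22.0, 44.1), (21.0, 55.0), (19.5, 39.2)]
for reading in stream:
    flagged = is_anomaly(reading, mean, direction, threshold)
    print(reading, "anomaly" if flagged else "normal")
```

In a real streaming setting the mean, direction, and threshold would likely need to be refreshed over a sliding window as the data distribution drifts. This sketch fits them once on a fixed batch.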
Contribution details
125-136.
2019.
published
Host publication details
Bobcatsss 2019. Information and technology transforming lives: connection, interaction, innovation
Gašo, Gordana ; Gilman Ranogajec, Mirna ; Žilić, Jure ; Lundman, Madeleine
Osijek:
978-953-314-121-3
Conference details
27th Bobcatsss Symposium Information and technology transforming lives: connection, interaction, innovation (Bobcatsss 2019)
lecture
22.01.2019-24.01.2019
Osijek, Croatia