Data source: CROSBI

Anomaly Detection in Netflow Network Traffic Using Supervised Machine Learning Algorithms (CROSBI ID 318605)

Journal article | original scientific paper | international peer review

Fosić, Igor ; Žagar, Drago ; Grgić, Krešimir ; Križanović, Višnja. Anomaly Detection in Netflow Network Traffic Using Supervised Machine Learning Algorithms // Journal of industrial information integration, 33 (2023), 100466, 10. doi: 10.1016/j.jii.2023.100466

Responsibility details

Fosić, Igor ; Žagar, Drago ; Grgić, Krešimir ; Križanović, Višnja

English

Anomaly Detection in Netflow Network Traffic Using Supervised Machine Learning Algorithms

Anomaly detection is an important method for monitoring network traffic, where it is important to successfully distinguish normal traffic from abnormal traffic. For this purpose, existing classification algorithms can be used as part of the machine learning (ML) process. In this paper, several classification algorithms (Stochastic Gradient Descent (SGD), Support Vector Machines (SVM), K-Nearest Neighbor (K-NN), Gaussian Naive Bayes (GNB), Decision Tree (DT), Random Forest (RF), AdaBoost (AB)) were tested on the public UNSW-NB15 dataset. Different encoding methods and ratios of training to test data were evaluated to obtain the optimal classifier parameters. Due to the imbalanced distribution of normal and abnormal network traffic data, both standard performance scores and additional classification performance scores (F2-score, Area Under ROC Curve (AUC)) that better describe the obtained results were used. The RF classifier obtained the best results, with F2-score = 97.68% and AUC score = 98.47%; a representative subset of the original dataset was used to shorten the duration of the computations. Features in the reference dataset were reduced by 82% and selected to follow the structure of the NetFlow data stream. In relation to similar studies, this paper compares several algorithms for anomaly detection and selects the best one for NetFlow data streams. The F2 and AUC metrics are applied, achieving very high accuracy compared to classic metrics, which do not show realistic accuracy on imbalanced datasets. Label encoding (LE) took less time than the One-hot (OH) encoding used in similar research while achieving the same accuracy. The novelty introduced by this paper lies in the optimization of the ML process and in the influence of the ratio of training to test data, different encoding methods for categorical features, and feature reduction on NetFlow data streams.
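
The abstract's preference for the F2-score over classic metrics on imbalanced traffic can be illustrated with a minimal sketch. The function names and the confusion-matrix counts below are hypothetical and are not taken from the paper; the code uses only the standard F-beta definition, in which beta = 2 weights recall (caught anomalies) more heavily than precision.

```python
def accuracy(tp: int, fp: int, fn: int, tn: int) -> float:
    """Share of all flows that were classified correctly."""
    return (tp + tn) / (tp + fp + fn + tn)


def fbeta_score(tp: int, fp: int, fn: int, beta: float = 2.0) -> float:
    """F-beta from confusion-matrix counts; beta > 1 favours recall."""
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined),
        # so the score is reported as 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical example: 1000 flows, of which 50 are anomalous.
# Detector A flags nothing as anomalous.
acc_a = accuracy(tp=0, fp=0, fn=50, tn=950)   # 0.95
f2_a = fbeta_score(tp=0, fp=0, fn=50)         # 0.0

# Detector B catches 45 of the 50 anomalies and raises 30 false alarms.
acc_b = accuracy(tp=45, fp=30, fn=5, tn=920)  # 0.965
f2_b = fbeta_score(tp=45, fp=30, fn=5)        # about 0.818

print(f"A: accuracy={acc_a:.3f}, F2={f2_a:.3f}")
print(f"B: accuracy={acc_b:.3f}, F2={f2_b:.3f}")
```

In this made-up case the two detectors differ by only 1.5 percentage points in accuracy, while the F2-score separates them clearly, which is the kind of effect the abstract attributes to imbalanced datasets.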

supervised algorithm ; machine learning ; anomaly classification ; NetFlow ; imbalanced dataset

Publication details

Volume: 33

Year: 2023

Article number: 100466

Pages: 10

Status: published

ISSN: 2467-964X

ISSN: 2452-414X

DOI: 10.1016/j.jii.2023.100466

Related fields

Electrical engineering, Information and communication sciences
