Anomaly Detection in Netflow Network Traffic Using Supervised Machine Learning Algorithms

Fosić, Igor; Žagar, Drago; Grgić, Krešimir; Križanović, Višnja

doi:10.1016/j.jii.2023.100466

Bibliographic record number: 1241658

Anomaly Detection in Netflow Network Traffic Using Supervised Machine Learning Algorithms // Journal of industrial information integration, 33 (2023), 100466, 10 doi:10.1016/j.jii.2023.100466 (international peer review, article, scientific)

CROSBI ID: 1241658

Title
Anomaly Detection in Netflow Network Traffic Using Supervised Machine Learning Algorithms

Authors
Fosić, Igor ; Žagar, Drago ; Grgić, Krešimir ; Križanović, Višnja

Source
Journal of industrial information integration (2467-964X) 33 (2023); 100466, 10

Type, subtype and category of work
Journal papers, article, scientific

Keywords
supervised algorithm ; machine learning ; anomaly classification ; NetFlow ; imbalanced dataset

Abstract
Anomaly detection is an important method for monitoring network traffic, where it is important to successfully distinguish normal traffic from abnormal traffic. For this purpose, one could use the existing classification algorithms as a part of the machine learning (ML) process. In this paper, several classification algorithms (Stochastic Gradient Descent (SGD), Support Vector Machines (SVM), K-Nearest Neighbor (K-NN), Gaussian Naive Bayes (GNB), Decision Tree (DT), Random Forest (RF), AdaBoost (AB)) were tested on the public UNSW-NB15 dataset. Different encoding methods and ratios of training and test data were examined to arrive at the optimal classifier parameters. Due to the imbalanced distribution of normal and abnormal network traffic data, both standard performance scores and additional classification performance scores (F2-score, Area Under ROC Curve (AUC)) were used, which better describe the obtained results. The RF classifier obtained the best results, with F2-score = 97.68% and AUC score = 98.47%; a representative subset of the original dataset was used to shorten the computations. Features in the reference dataset were reduced by 82% and selected following the structure of the NetFlow data stream. In comparison with similar studies, this paper compares several algorithms for anomaly detection and selects the best one for NetFlow data streams. The F2 and AUC metrics are applied and yield very high scores, whereas classic metrics do not show realistic accuracy on imbalanced datasets. Label encoding (LE) took less time than the One-hot (OH) encoding used in similar research, with the same accuracy. The novelty introduced by this paper lies in the optimization of the ML process and in examining the influence of the ratio of training to test data, different encoding methods for categorical features, and feature reduction on NetFlow data streams.
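The abstract does not define the F2-score. The sketch below assumes the standard F-beta definition with beta = 2, which weights recall more heavily than precision, and shows why plain accuracy can look good on an imbalanced dataset while F2 does not. The function names and the confusion-matrix counts are hypothetical and chosen only for illustration; they are not taken from the paper or from UNSW-NB15.

```python
def fbeta_score(tp: int, fp: int, fn: int, beta: float = 2.0) -> float:
    """F-beta from confusion-matrix counts; beta > 1 favours recall."""
    b2 = beta * beta
    denom = (1 + b2) * tp + b2 * fn + fp
    # Return 0.0 when there are no positives at all, to avoid division by zero.
    return (1 + b2) * tp / denom if denom else 0.0


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all samples that were classified correctly."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


# Hypothetical counts (assumption): 1000 flows, 50 of them anomalous,
# and a weak detector that finds only 10 of the 50 anomalies.
tp, fn, fp, tn = 10, 40, 5, 945

acc = accuracy(tp, tn, fp, fn)
f2 = fbeta_score(tp, fp, fn, beta=2.0)
print(f"accuracy={acc:.3f}  F2={f2:.3f}")
```

With these made-up counts the accuracy is above 95% because normal traffic dominates, while the F2-score stays low because most anomalies are missed. This matches the abstract's argument that classic metrics can be misleading on imbalanced data.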

Original language
English

Scientific fields
Electrical engineering, Information and communication sciences

AFFILIATION OF THE WORK

Institutions:
Fakultet elektrotehnike, računarstva i informacijskih tehnologija Osijek

Profiles:

Drago Žagar (author)

Višnja Križanović (author)

Krešimir Grgić (author)

Links to the full text of the work:

doi papers.ssrn.com www.sciencedirect.com

The journal is indexed in:

Current Contents Connect (CCC)
Web of Science Core Collection (WoSCC)

Science Citation Index Expanded (SCI-EXP)
SCI-EXP, SSCI and/or A&HCI

Scopus
