AI-Driven Fast and Early Detection of IoT Botnet Threats: A Comprehensive Network Traffic Analysis Approach

Abdelaziz Amara korba, Aleddine Diaf, Yacine Ghamri-Doudane · July 22, 2024

Summary

The study focuses on early detection of IoT botnet threats, particularly the stealthy bot communication that precedes attacks, and proposes a comprehensive network traffic analysis approach. It examines which network features best represent traffic and characterize benign IoT patterns, and it models that traffic with semi-supervised learning. The authors report detecting botnet traffic with a 100% success rate using packet-based methods and 94% using flow-based approaches, with a false positive rate of 1.53%. Against a growing cyber threat landscape, the study argues for proactive measures that stop botnet attacks before they materialize, and it aims to minimize detection delay, which is crucial for limiting the impact of an infection and preventing the spread of bot malware.

Existing botnet detectors rely heavily on supervised learning, which requires malicious traffic for training. Such traffic is often unavailable in real-world scenarios, which limits a model's ability to identify unknown botnet traffic and new threats. Detection also depends on accurately recognizing normal traffic patterns, and normal and malicious traffic are hard to tell apart during stealthy communication phases. To address both problems, the study investigates semi-supervised learning, focusing on one-class classification methods that need no malicious traffic for training, to model normal network behavior and detect a wide range of bot types.

The methodology analyzes network traffic in both flow and packet-based formats and excludes sensitive details to protect user privacy. Network features are categorized into packet-based, byte-based, time-based, and protocol-based metrics.
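
To make the four feature categories concrete, the following is a minimal sketch of how one unidirectional flow might be summarized. It is not the authors' feature set: the `Packet` record, the `flow_features` helper, and the specific feature names are hypothetical, chosen only to show one feature per category. In keeping with the privacy constraint above, the record carries no addresses or payload.

```python
from collections import Counter
from dataclasses import dataclass
from statistics import mean


@dataclass
class Packet:
    """Hypothetical packet record; no IP addresses or payload are kept."""
    timestamp: float  # seconds since capture start
    size: int         # bytes on the wire
    protocol: str     # e.g. "TCP", "UDP", "ICMP"


def flow_features(packets):
    """Summarize one unidirectional flow with features from each category."""
    if not packets:
        raise ValueError("a flow needs at least one packet")
    pkts = sorted(packets, key=lambda p: p.timestamp)
    sizes = [p.size for p in pkts]
    # Gaps between consecutive packets (empty for a single-packet flow).
    gaps = [b.timestamp - a.timestamp for a, b in zip(pkts, pkts[1:])]
    protocols = Counter(p.protocol for p in pkts)
    return {
        # packet-based
        "packet_count": len(pkts),
        # byte-based
        "total_bytes": sum(sizes),
        "mean_packet_size": mean(sizes),
        # time-based
        "duration_s": pkts[-1].timestamp - pkts[0].timestamp,
        "mean_inter_arrival_s": mean(gaps) if gaps else 0.0,
        # protocol-based
        "dominant_protocol": protocols.most_common(1)[0][0],
    }


# Illustrative three-packet flow (made-up values).
flow = [Packet(0.00, 60, "TCP"), Packet(0.25, 1500, "TCP"), Packet(1.00, 40, "TCP")]
features = flow_features(flow)
```

A packet-based representation would instead keep one feature vector per packet, which is what allows a decision before a flow has finished.
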
A filter feature selection method is employed, using five criteria: Spectral Score, Information Score, Pearson Correlation, Intra-class Distance, and Interquartile Range. These criteria help refine the feature set for more accurate botnet detection. Five semi-supervised learning algorithms are evaluated for detecting anomalies in network traffic: Isolation Forest, Elliptic Envelope, Local Outlier Factor, One-Class SVM, and Deep Autoencoders.

The evaluation uses the Aposemat IoT-23 dataset from the Stratosphere Laboratory at the Czech Technical University (CTU) in Prague. The dataset contains 23 scenarios of IoT network traffic, including real malware infections and benign traffic.

The evaluation focuses on One-Class SVM and Autoencoder methods for modeling normal IoT traffic patterns. The results confirm that these techniques accurately detect botnet activity, including stealthy network traffic such as scanning and command-and-control (C2) communications. In packet-based traffic, bots are detected at early stages with a detection delay of less than 1 second, a perfect detection rate, and a false positive rate (FPR) under 2%. For unidirectional flow traffic, a 98% detection rate is achieved with around 2% FPR. The study concludes that effectively modeling normal network traffic for IoT devices is feasible using packet-based and unidirectional flow formats, alongside optimized time-based and protocol-based features.
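
The one-class idea described above can be illustrated with a minimal sketch, assuming scikit-learn and NumPy are available. It is not the authors' pipeline: the data are synthetic Gaussian vectors standing in for traffic features (not IoT-23), and the hyperparameters (`nu=0.02`, RBF kernel) are illustrative assumptions. The point is only that the model is fitted on benign samples alone, and that detection rate and FPR are then measured on held-out benign and malicious samples.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)

# Synthetic stand-ins for traffic feature vectors (assumption, NOT IoT-23 data).
benign_train = rng.normal(loc=0.0, scale=1.0, size=(500, 4))
benign_test = rng.normal(loc=0.0, scale=1.0, size=(200, 4))
bot_test = rng.normal(loc=6.0, scale=1.0, size=(200, 4))

# Fit the scaler and the one-class model on benign traffic only;
# no malicious sample is seen during training.
scaler = StandardScaler().fit(benign_train)
model = OneClassSVM(kernel="rbf", gamma="scale", nu=0.02)
model.fit(scaler.transform(benign_train))

# predict() returns +1 for inliers (normal) and -1 for outliers (anomalous).
benign_pred = model.predict(scaler.transform(benign_test))
bot_pred = model.predict(scaler.transform(bot_test))

# Detection rate: share of malicious samples flagged as anomalous.
detection_rate = float(np.mean(bot_pred == -1))
# False positive rate: share of benign samples flagged as anomalous.
false_positive_rate = float(np.mean(benign_pred == -1))

print(f"detection rate: {detection_rate:.2%}, FPR: {false_positive_rate:.2%}")
```

scikit-learn's `IsolationForest` and `EllipticEnvelope` expose the same `fit`/`predict` interface and can be swapped in directly; `LocalOutlierFactor` needs `novelty=True` before it can score unseen samples. The numbers this sketch prints reflect the well-separated synthetic data only and say nothing about the results reported in the study.
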
In conclusion, the study presents a comprehensive approach to early detection of IoT botnet threats, utilizing semi-supervised learning techniques for network traffic analysis. It demonstrates high detection rates and low false positive rates, emphasizing the importance of proactive measures in preventing botnet attacks. The research contributes to the field of cybersecurity by providing a robust methodology for detecting botnet activities, particularly during stealthy communication phases, and highlights the effectiveness of packet-based and unidirectional flow formats in botnet detection.

Tables

Introduction

Background

Overview of the growing cyber threat landscape

Importance of early detection in preventing botnet attacks

Objective

Aim of the research: proposing a network traffic analysis approach for early detection of IoT botnet threats

Focus on stealth bot communication preceding attacks

Method

Data Collection

Sources of network traffic data

Methods for collecting data on IoT devices

Data Preprocessing

Techniques for cleaning and preparing data for analysis

Feature extraction from collected network traffic

Network Feature Representation

Categorization of network features into packet-based, byte-based, time-based, and protocol-based metrics

Selection of relevant features using filter methods

Semi-Supervised Learning Techniques

Overview of semi-supervised learning methods

Application of one-class classification methods for modeling normal network behavior

Evaluation

Dataset

Description of the Aposemat IoT-23 dataset

Source: Stratosphere Laboratory at the Czech Technical University (CTU) in Prague

Evaluation Metrics

Detection rate

False positive rate (FPR)

Detection delay

Algorithm Evaluation

Comparison of five semi-supervised learning algorithms

Performance metrics for each algorithm

Results

Detection Performance

Detection rates for packet-based and flow-based traffic

False positive rates for each traffic format

Detection delay for packet-based traffic

Semi-Supervised Learning Techniques

Evaluation of One-Class SVM and Autoencoder methods

Results on modeling normal IoT traffic patterns

Conclusion

Summary of Findings

High detection rates and low false positive rates achieved

Feasibility of detecting botnet activities, including stealth network traffic

Importance of packet-based and unidirectional flow formats in botnet detection

Contributions to Cybersecurity

Robust methodology for detecting botnet threats

Emphasis on proactive measures in preventing botnet attacks

Effectiveness of semi-supervised learning techniques in network traffic analysis

Basic info

papers

cryptography and security

artificial intelligence

Insights

What is the main focus of the study discussed in the text?

How does the study address the challenges in detecting botnet attacks, particularly in the absence of malicious traffic for training supervised learning models?

What semi-supervised learning techniques does the study explore for detecting botnet traffic?