A Comparative Study of Machine Learning Models for HTTP Flood Attack Detection.

Abdu Seid; Srinivasan.T.R.; Mr. Fetulhak

dc.contributor.author	Abdu Seid
dc.contributor.author	Srinivasan.T.R.
dc.contributor.author	Mr. Fetulhak
dc.date.accessioned	2023-05-16T07:02:36Z
dc.date.available	2023-05-16T07:02:36Z
dc.date.issued	2023-02
dc.identifier.uri	https://repository.ju.edu.et//handle/123456789/8129
dc.description.abstract	Nowadays, almost every aspect of human life is affected by the Internet, and cyberattacks and intrusions have become regular news. Among the many attack types, denial-of-service (DoS) attacks remain among the most severe because of their potential impact. Attacks at the application layer are particularly hard to identify because they are stealthy by nature. HTTP flooding is an application-layer attack in which the attacker sends a large number of seemingly legitimate HTTP GET or POST requests to a web server or application, which makes it simple to bring a targeted site or server down and therefore extremely damaging. Machine learning and artificial intelligence research has expanded rapidly in recent years, offering new opportunities for intrusion detection solutions. However, data availability continues to limit the success of such systems, as high-quality intrusion detection system (IDS) datasets are scarce. This study contributes to the detection of HTTP flood attacks by comparing five machine learning approaches. Because the dataset is an important part of building machine learning-based IDS models, the process starts with generating one. Normal HTTP traffic was generated with Selenium, a web browser automation tool; HTTP flood attack traffic was generated with tools such as slowhttptest and HOIC. Wireshark was used to capture the network traffic and save it as a pcap file, and CICFlowMeter was then used to convert the pcap file to CSV format, extracting 84 features. After manual and automatic feature selection, 30 of these features were retained and used as input to the machine learning models in the experiments. This study analyzes a machine learning-based HTTP flood attack detection system.
Five supervised machine learning classifiers are evaluated: Random Forest (RF), AdaBoost, Naive Bayes (NB), multi-layer perceptron (MLP), and a long short-term memory recurrent neural network (RNN-LSTM). They are compared on seven performance measures: accuracy, precision, recall, F-measure, false positive rate, false negative rate, and training time in seconds. With a test size of 20%, Random Forest produced the best results on four of these measures (accuracy, recall, F-measure, and false negative rate), with values of 98.30%, 97.03%, 97.98%, and 2.96%, respectively. Naive Bayes was comparatively the worst performer for the detection of HTTP flood attacks in this study. Although the ranking of the classifiers varies slightly from metric to metric, ordering them by accuracy from best to worst gives Random Forest, MLP, RNN-LSTM, AdaBoost, and Naive Bayes, with accuracies of 98.30%, 97.75%, 97.70%, 95.48%, and 93.54%, respectively.	en_US
dc.language.iso	en_US	en_US
dc.subject	HTTP flood attacks, Intrusion detection system, machine learning, classification metrics	en_US
dc.title	A Comparative Study of Machine Learning Models for HTTP Flood Attack Detection.	en_US
dc.type	Thesis	en_US
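
The six rate-based measures named in the abstract can all be derived from the four cells of a binary confusion matrix. The following is a minimal sketch of those standard definitions, not the thesis's own evaluation code. The names `ConfusionCounts`, `count_outcomes`, and `binary_metrics`, the choice of label 1 for "attack", and the example counts are all assumptions made for illustration. It also shows why recall and false negative rate are complementary, which is consistent with the reported 97.03% and 2.96%.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Cells of a binary confusion matrix (hypothetical helper type)."""
    tp: int  # attack flows correctly flagged as attack
    fp: int  # normal flows wrongly flagged as attack
    tn: int  # normal flows correctly passed as normal
    fn: int  # attack flows missed by the classifier


def _ratio(numerator, denominator):
    # Return 0.0 instead of raising when a class is absent.
    return numerator / denominator if denominator else 0.0


def count_outcomes(y_true, y_pred, positive=1):
    """Tally confusion-matrix cells; `positive` is the assumed attack label."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return ConfusionCounts(tp=tp, fp=fp, tn=tn, fn=fn)


def binary_metrics(c):
    """Standard definitions of the rate-based measures, as fractions in [0, 1]."""
    precision = _ratio(c.tp, c.tp + c.fp)
    recall = _ratio(c.tp, c.tp + c.fn)
    return {
        "accuracy": _ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn),
        "precision": precision,
        "recall": recall,
        "f_measure": _ratio(2 * precision * recall, precision + recall),
        "false_positive_rate": _ratio(c.fp, c.fp + c.tn),
        "false_negative_rate": _ratio(c.fn, c.tp + c.fn),
    }


if __name__ == "__main__":
    # Made-up counts for illustration only; not results from the thesis.
    example = ConfusionCounts(tp=90, fp=5, tn=95, fn=10)
    for name, value in binary_metrics(example).items():
        print(f"{name}: {value:.4f}")
```

Since recall and false negative rate share the denominator TP + FN, they always sum to 1, so a classifier that ranks best on one necessarily ranks best on the other. Training time, the seventh measure, is a wall-clock measurement and is not covered by this sketch.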