Analysis of the False Prediction of the Logistic Regression Algorithm in SQL Payload Classification and its Impact on the Principles of Information Security (CIA)

Authors

DOI:

https://doi.org/10.52866/ijcsm.2023.04.04.015

Keywords:

SQL injection, Logistic Regression, SVD, Confidentiality, Integrity, Availability

Abstract

Organizations and individuals must secure sensitive data to maintain its confidentiality, integrity, and timely availability. Attackers who gain unauthorized access can misuse that data for illicit purposes, and the consequences can be severe: significant financial losses and a breakdown of trust between individuals and organizations. Structured Query Language injection (SQL-i) is one of the most prevalent methods of gaining such access. It exploits a vulnerability in the query statement that gives an attacker quick and easy access to the data, allowing an unauthorized user to tamper with or erase data, or to hinder legitimate access to it. To counteract these attacks, this research builds a Machine Learning (ML) model that classifies the payloads sent by users as harmful or benign. The work has two aims: to reduce the time required for payload classification, and to examine the model's false predictions when classifying SQL requests, along with their ramifications for the principles of information security (Confidentiality, Integrity, and Availability, CIA). The model is trained on a dataset of harmful and benign payloads using Logistic Regression (LR) and Singular Value Decomposition (SVD). It achieved an accuracy of 98.20%, a precision of 98.02%, a recall of 99.65%, and an F1 score of 98.20%, and classified a payload in 0.0029 seconds. The model categorizes payloads accurately and reduces processing time significantly because LR is used in tandem with SVD, which helps select the most pertinent features for training.
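
The following is a minimal sketch of the kind of pipeline the abstract describes, not the authors' implementation. It assumes scikit-learn and character n-gram TF-IDF features. The abstract does not state the feature extraction step, the number of SVD components, the hyperparameters, or the dataset, so all of those are illustrative assumptions, as are the toy payloads. The reading of false negatives and false positives in terms of CIA given in the comments is likewise one plausible interpretation.

```python
import time

from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.pipeline import make_pipeline

# Hand-written toy payloads for illustration only (NOT the paper's dataset).
# Label 1 = harmful (SQL injection attempt), label 0 = benign.
payloads = [
    "' OR '1'='1' --",
    "1; DROP TABLE users",
    "admin' --",
    "' UNION SELECT username, password FROM users --",
    "1 OR 1=1",
    "'; DELETE FROM orders; --",
    "alice",
    "bob@example.com",
    "blue running shoes",
    "order 12345",
    "hello world",
    "2023-11-19",
]
labels = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

# Assumed design: character n-gram TF-IDF -> SVD (dimensionality reduction)
# -> Logistic Regression. n_components=5 is an arbitrary choice for this toy set.
model = make_pipeline(
    TfidfVectorizer(analyzer="char", ngram_range=(1, 3)),
    TruncatedSVD(n_components=5, random_state=0),
    LogisticRegression(max_iter=1000),
)
model.fit(payloads, labels)

# For brevity the model is scored on its own training data; a real
# evaluation would use a held-out test split.
predictions = model.predict(payloads)
tn, fp, fn, tp = confusion_matrix(labels, predictions, labels=[0, 1]).ravel()

# Assumed interpretation of the two kinds of false prediction:
#   false negative = harmful payload accepted as benign, so the injection
#                    reaches the database (confidentiality, integrity and
#                    availability are all exposed);
#   false positive = benign payload rejected as harmful, so a legitimate
#                    user is denied service (availability suffers).
print(f"false negatives: {fn}, false positives: {fp}")

# Time the classification of a single payload.
start = time.perf_counter()
model.predict(["' OR 1=1 --"])
elapsed = time.perf_counter() - start
print(f"classification time: {elapsed:.4f} s")
```

Placing SVD between the vectorizer and the classifier means LR is trained on a handful of dense components rather than on every n-gram, which is one way to keep the time per prediction low.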

Published

2023-11-19

How to Cite

[1]
Rehab Flaih Hasan, O. S. F. Shareef, and Ammar Hatem Farhan, “Analysis of the False Prediction of the Logistic Regression Algorithm in SQL Payload Classification and its Impact on the Principles of Information Security (CIA)”, Iraqi Journal For Computer Science and Mathematics, vol. 4, no. 4, pp. 191–203, Nov. 2023.

Issue

Section

Articles