Enhancing security assurance in software development: AI-based vulnerable code detection with static analysis.

Rajapaksha, Sampath; Senanayake, Janaka; Kalutarage, Harsha; Al-Kadri, Mhd Omar

Contributors

Sokratis Katsikas
Editor

Abstract

Vulnerable source code in software applications causes significant reliability and security issues, which can be mitigated by integrating and assuring software security principles during the early stages of the development lifecycle. One promising approach to identifying vulnerabilities in source code is the use of Artificial Intelligence (AI). This research proposes an AI-based method for detecting source code vulnerabilities and leverages Explainable AI to help developers identify and understand vulnerable source code tokens. To train the model, a web crawler was used to collect a real-world dataset of 600,000 source code samples, which were annotated using static analysers. Several machine learning (ML) classifiers were tested on a feature vector generated using Natural Language Processing techniques. The Random Forest and Extreme Gradient Boosting classifiers were found to perform well in the binary and multi-class approaches, respectively. The proposed model achieved a 0.96 F1-Score in binary classification and a 0.85 F1-Score in multi-class classification based on Common Weakness Enumeration (CWE) IDs. The model, trained on a dataset of real source code, is highly generalisable and has been integrated into a live web portal to validate its performance on real-world code vulnerabilities.
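
The abstract mentions two steps that are easy to illustrate: turning source code into a feature vector, and scoring predictions with the F1-Score. The following is a minimal sketch of both, not the authors' implementation. It assumes a simple bag-of-words token count as the feature representation (the abstract does not say which NLP technique was used), and the names `tokenise`, `build_vocabulary`, `vectorise` and `f1_score` are hypothetical helpers written for this example.

```python
import re
from collections import Counter

# Assumption: identifiers, integer literals and single punctuation marks are tokens.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|\S")


def tokenise(code):
    """Split a source code snippet into a list of tokens."""
    return TOKEN_RE.findall(code)


def build_vocabulary(snippets):
    """Map each distinct token to a column index, in order of first appearance."""
    vocab = {}
    for snippet in snippets:
        for token in tokenise(snippet):
            vocab.setdefault(token, len(vocab))
    return vocab


def vectorise(code, vocab):
    """Return a bag-of-words count vector; tokens outside the vocabulary are ignored."""
    vector = [0] * len(vocab)
    for token, count in Counter(tokenise(code)).items():
        if token in vocab:
            vector[vocab[token]] = count
    return vector


def f1_score(y_true, y_pred, positive=1):
    """Binary F1: the harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy snippets written for this example; they are not taken from the paper's dataset.
snippets = ["gets(buf);", "fgets(buf, n, stdin);"]
vocab = build_vocabulary(snippets)
features = [vectorise(s, vocab) for s in snippets]
```

Vectors like `features` could then be passed to a classifier such as a Random Forest, with labels supplied by a static analyser. For a multi-class setting keyed on CWE IDs, one option is to compute `f1_score` once per class by passing that class as `positive` and then averaging the results; the abstract does not state which averaging scheme the authors used.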

Citation

RAJAPAKSHA, S., SENANAYAKE, J., KALUTARAGE, H. and AL-KADRI, M.O. 2024. Enhancing security assurance in software development: AI-based vulnerable code detection with static analysis. In Katsikas, S. et al. (eds.) Computer security: revised selected papers from the proceedings of the International workshops of the 28th European symposium on research in computer security (ESORICS 2023 International Workshops), 25-29 September 2023, The Hague, Netherlands. Lecture notes in computer science, 14399. Cham: Springer [online], part II, pages 341-356. Available from: https://doi.org/10.1007/978-3-031-54129-2_20

Conference Name International workshops of the 28th European symposium on research in computer security (ESORICS 2023 International Workshops)
Conference Location The Hague, Netherlands
Start Date Sep 25, 2023
End Date Sep 29, 2023
Acceptance Date Aug 14, 2023
Online Publication Date Mar 12, 2024
Publication Date Dec 31, 2024
Deposit Date Apr 26, 2024
Publicly Available Date Mar 13, 2025
Publisher Springer
Pages 341-356
Series Title Lecture notes in computer science
Series Number 14399
Series ISSN 0302-9743; 1611-3349
Book Title Computer security: revised selected papers from the proceedings of the International workshops of the 28th European symposium on research in computer security (ESORICS 2023 International Workshops), part II
ISBN 9783031541285
DOI https://doi.org/10.1007/978-3-031-54129-2_20
Keywords Source code vulnerability; Artificial intelligence; Software security; Vulnerability scanners
Public URL https://rgu-repository.worktribe.com/output/2271880