Mathew Nicho
Replacing human input in spam email detection using deep learning.
Nicho, Mathew; Majdani, Farzan; McDermott, Christopher D.
Authors
Contributors
Helmut Degen
Editor
Stavroula Ntoa
Editor
Abstract
The Covid-19 pandemic has been a driving force for a substantial increase in online activity and transactions across the globe. As a consequence, cyber-attacks, particularly those leveraging email as the preferred attack vector, have also increased exponentially since Q1 2020. Despite this, email remains a popular communication tool. Previously, in an effort to reduce the amount of spam entering a user's inbox, many email providers started to incorporate spam filters into their products. However, many commercial spam filters rely on a human to train the filter, leaving a margin of risk if sufficient training has not occurred. Knowing this, hackers employ more targeted and nuanced obfuscation methods to bypass built-in spam filters. In response to this continued problem, there is a growing body of research on the use of machine learning techniques for spam filtering. In many cases, detection results have shown great promise, but the techniques often still rely on human input to classify training datasets. In this study, we specifically explore the use of deep learning as a method of reducing the human input required for spam detection. First, we evaluate the efficacy of popular freeware spam detection tools. Next, we narrow down machine learning techniques to select the appropriate method for our dataset. We then compare the accuracy of this method with that of the freeware spam detection tools. Our results showed that our deep learning model, based on simple word embedding and global max pooling (SWEM-max), had higher accuracy (98.41%) than both Thunderbird (95%) and Mailwasher (92%), which are based on Bayesian spam filtering. Finally, we consider whether this improvement is enough to accept the removal of human input in spam email detection.
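As a rough illustration of the pooling idea named in the abstract, the sketch below applies an element-wise maximum over word embeddings and feeds the pooled vector to a logistic layer. It is a minimal toy under stated assumptions, not the authors' model: the embedding table, the weights, the bias, and the function names `swem_max` and `spam_probability` are all invented here, whereas a trained model would learn the embeddings and weights from labelled emails.

```python
import math

# Hypothetical 3-dimensional embedding table; words and values are
# invented for illustration and are not taken from the paper.
EMBEDDINGS = {
    "free":    [0.9, 0.1, 0.0],
    "prize":   [0.8, 0.0, 0.2],
    "meeting": [0.0, 0.7, 0.1],
    "agenda":  [0.1, 0.9, 0.0],
}
UNKNOWN = [0.0, 0.0, 0.0]  # fallback vector for out-of-vocabulary words


def swem_max(tokens):
    """Global max pooling: element-wise maximum over all token embeddings."""
    vectors = [EMBEDDINGS.get(token, UNKNOWN) for token in tokens]
    if not vectors:
        return list(UNKNOWN)
    # zip(*vectors) groups the values of each embedding dimension together.
    return [max(dimension) for dimension in zip(*vectors)]


def spam_probability(tokens, weights, bias):
    """Logistic layer on the pooled vector (weights and bias are assumed)."""
    pooled = swem_max(tokens)
    score = sum(w * x for w, x in zip(weights, pooled)) + bias
    return 1.0 / (1.0 + math.exp(-score))


# Hypothetical classifier parameters, chosen by hand for this toy example.
WEIGHTS = [4.0, -4.0, 0.0]
BIAS = 0.0

if __name__ == "__main__":
    for text in ("free prize inside", "meeting agenda"):
        tokens = text.split()
        print(text, "->", round(spam_probability(tokens, WEIGHTS, BIAS), 3))
```

Because the pooled vector has a fixed length regardless of how many words the email contains, the classifier on top can stay very small, which is one reason this family of models is considered simple.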
Citation
NICHO, M., MAJDANI, F. and MCDERMOTT, C.D. 2022. Replacing human input in spam email detection using deep learning. In DEGEN, H. and NTOA, S. (eds.) Artificial intelligence in HCI: proceedings of 3rd International conference on artificial intelligence in HCI (human-computer interaction) 2022 (AI-HCI 2022), co-located with the 24th International conference on human-computer interaction 2022 (HCI International 2022), 26 June - 1 July 2022, [virtual conference]. Lecture notes in artificial intelligence (LNAI), 13336. Cham: Springer [online], pages 387-404. Available from: https://doi.org/10.1007/978-3-031-05643-7_25
Presentation Conference Type | Conference Paper (published) |
---|---|
Conference Name | 3rd International conference on artificial intelligence in HCI (human-computer interaction) 2022 (AI-HCI 2022), co-located with the 24th International conference on human-computer interaction 2022 (HCI International 2022) |
Start Date | Jun 26, 2022 |
End Date | Jul 1, 2022 |
Acceptance Date | Apr 25, 2022 |
Online Publication Date | May 15, 2022 |
Publication Date | Dec 31, 2022 |
Deposit Date | Jun 9, 2022 |
Publicly Available Date | May 16, 2023 |
Publisher | Springer |
Peer Reviewed | Peer Reviewed |
Volume | 13336 |
Pages | 387-404 |
Series Title | Lecture notes in artificial intelligence (LNAI) |
Series ISSN | 0302-9743 |
Book Title | Artificial intelligence in HCI |
ISBN | 9783031056420 |
DOI | https://doi.org/10.1007/978-3-031-05643-7_25 |
Keywords | Deep learning; Global max pooling; Phishing emails; Simple word embedding; Spam detection |
Public URL | https://rgu-repository.worktribe.com/output/1681939 |
Files
NICHO 2022 Replacing human input (AAM)
(413 Kb)
PDF
Copyright Statement
This version of the contribution has been accepted for publication, after peer review (when applicable) but is not the Version of Record and does not reflect post-acceptance improvements, or any corrections. The Version of Record is available online at: https://doi.org/10.1007/978-3-031-05643-7_25. This accepted manuscript is subject to Springer Nature's AM terms of use.