Pattaramon Vuttipittayamongkol
Overlap-based undersampling method for classification of imbalanced medical datasets.
Vuttipittayamongkol, Pattaramon; Elyan, Eyad
Authors
Professor Eyad Elyan e.elyan@rgu.ac.uk
Contributors
Ilias Maglogiannis
Editor
Lazaros Iliadis
Editor
Elias Pimenidis
Editor
Abstract
Early diagnosis of life-threatening diseases such as cancers and heart disease is crucial for effective treatment. Supervised machine learning has proved to be a very useful tool for this purpose. Historical patient data, including clinical and demographic information, is used to train learning algorithms, which build predictive models that provide initial diagnoses. However, in the medical domain, it is common for the positive class to be under-represented in a dataset. In such a scenario, a typical learning algorithm tends to be biased towards the negative class, which is the majority class, and to misclassify positive cases. This is known as the class imbalance problem. In this paper, a framework for predictive diagnostics of diseases with imbalanced records is presented. To reduce the classification bias, we propose using an overlap-based undersampling method to improve the visibility of minority class samples in the region where the two classes overlap. This is achieved by detecting and removing negative class instances from the overlapping region, which improves class separability in the data space. Experimental results show high accuracy on the positive class, which is highly desirable in the medical domain, together with good trade-offs between sensitivity and specificity. Results also show that the method often outperformed other state-of-the-art and well-established techniques.
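The general idea of removing negative instances from the overlapping region can be illustrated with a small sketch. The code below is not the procedure from the paper: it assumes one simple nearest-neighbour rule (a negative instance is treated as lying in the overlap if any of its k nearest neighbours is positive), and the function names, the value of k and the toy data are all hypothetical choices made for illustration.

```python
import math

def k_nearest(idx, X, k):
    """Indices of the k points closest to X[idx] (Euclidean), excluding idx itself."""
    dists = sorted((math.dist(X[idx], X[j]), j) for j in range(len(X)) if j != idx)
    return [j for _, j in dists[:k]]

def undersample_overlap(X, y, k=3, positive=1):
    """Illustrative rule (an assumption, not the paper's algorithm):
    drop every negative instance that has at least one positive
    instance among its k nearest neighbours; keep all positives."""
    keep = []
    for i, label in enumerate(y):
        if label == positive:
            keep.append(i)  # minority (positive) instances are never removed
        elif not any(y[j] == positive for j in k_nearest(i, X, k)):
            keep.append(i)  # negative instance outside the overlapping region
    return [X[i] for i in keep], [y[i] for i in keep]

# Hypothetical toy data: a negative cluster near the origin, two positives
# near (5, 5), and two negatives sitting right next to those positives.
X = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (0.5, 0.5),
     (4.8, 5.0), (5.2, 5.1),   # negatives inside the overlap
     (5.0, 5.0), (5.5, 5.5)]   # positives
y = [0, 0, 0, 0, 0, 0, 0, 1, 1]

X_res, y_res = undersample_overlap(X, y, k=2)
print(len(X), "->", len(X_res), "instances; positives kept:", y_res.count(1))
```

On this toy set, the two negatives adjacent to the positive instances are removed while the negative cluster and both positives are kept, so the positive class becomes easier to separate. A classifier would then be trained on `X_res` and `y_res` and evaluated on untouched test data using sensitivity and specificity.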
Citation
VUTTIPITTAYAMONGKOL, P. and ELYAN, E. 2020. Overlap-based undersampling method for classification of imbalanced medical datasets. In Maglogiannis, I., Iliadis, L. and Pimenidis, E. (eds.) Artificial intelligence applications and innovations: AIAI 2020; proceedings of 16th International Federation for Information Processing working group (IFIP WG) 12.5 International artificial intelligence applications and innovations, 5-7 June 2020, Halkidiki, Greece. IFIP advances in information and communication technology, 584. Cham: Springer [online], pages 358-369. Available from: https://doi.org/10.1007/978-3-030-49186-4_30
Presentation Conference Type | Conference Paper (published) |
---|---|
Conference Name | 16th International artificial intelligence applications and innovations (AIAI 2020) |
Start Date | Jun 5, 2020 |
End Date | Jun 7, 2020 |
Acceptance Date | Mar 29, 2020 |
Online Publication Date | May 29, 2020 |
Publication Date | Dec 31, 2020 |
Deposit Date | Jun 29, 2020 |
Publicly Available Date | Jun 29, 2020 |
Publisher | Springer |
Peer Reviewed | Peer Reviewed |
Volume | 584 |
Pages | 358-369 |
Series Title | IFIP advances in information and communication technology |
Series Number | 6102 |
Series ISSN | 1868-4238 |
Book Title | Artificial intelligence applications and innovations: AIAI 2020 |
ISBN | 9783030491857 |
DOI | https://doi.org/10.1007/978-3-030-49186-4_30 |
Keywords | Imbalanced data; Medical diagnosis; Medical prediction; Class overlap; Classification; Undersampling; Nearest neighbour; Machine learning |
Public URL | https://rgu-repository.worktribe.com/output/937472 |
Files
VUTTIPITTAYAMONGKOL 2020 Overlap-based (AAM)
(467 Kb)
PDF