Pattaramon Vuttipittayamongkol
Improved overlap-based undersampling for imbalanced dataset classification with application to epilepsy and Parkinson's disease.
Vuttipittayamongkol, Pattaramon; Elyan, Eyad
Abstract
Classification of imbalanced datasets has attracted substantial research interest over the past decades. Imbalanced datasets are common in domains such as health, finance and security. Most solutions for handling imbalanced datasets focus on the class distribution problem and aim to provide more balanced datasets by means of resampling. However, existing literature shows that class overlap has a higher negative impact on the learning process than class distribution. In this paper, we propose overlap-based undersampling methods that maximize the visibility of minority class instances in the overlapping region. This is achieved by using soft clustering and an elimination threshold that adapts to the degree of overlap to identify and eliminate negative instances in the overlapping region. For more accurate clustering and detection of overlapped negative instances, the presence of the minority class in the borderline areas is emphasized by means of oversampling. Extensive experiments were carried out using simulated and real-world datasets covering a wide range of imbalance and overlap scenarios, including extreme cases. Results show significant improvement in sensitivity and competitive performance compared with well-established and state-of-the-art methods.
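The general idea of removing negative (majority) instances that soft clustering places in the overlapping region can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `fuzzy_c_means` and `overlap_undersample`, the fixed `threshold` parameter, and the way the "minority cluster" is chosen are all assumptions made for illustration. The adaptive threshold and the borderline oversampling step described in the abstract are not reproduced here.

```python
import numpy as np

def fuzzy_c_means(X, n_clusters=2, m=2.0, n_iter=100, seed=0):
    """Plain fuzzy C-means (standard update rules); returns (centers, memberships)."""
    rng = np.random.default_rng(seed)
    # Random initial memberships, each row normalised to sum to 1.
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** m
        # Cluster centres are membership-weighted means of the data.
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Distance of every instance to every centre (small epsilon avoids 0).
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        inv = d ** (-2.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)
    return centers, U

def overlap_undersample(X, y, threshold=0.5, minority=1):
    """Drop majority instances whose membership in the minority cluster
    exceeds `threshold` (a fixed, hypothetical value in this sketch)."""
    _, U = fuzzy_c_means(X)
    # Assumption: the minority cluster is the one minority instances
    # belong to most strongly on average.
    min_cluster = U[y == minority].mean(axis=0).argmax()
    drop = (y != minority) & (U[:, min_cluster] > threshold)
    return X[~drop], y[~drop], drop

# Synthetic imbalanced, overlapping data: 200 negatives, 20 positives.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (200, 2)), rng.normal(2, 1, (20, 2))])
y = np.array([0] * 200 + [1] * 20)
X_res, y_res, dropped = overlap_undersample(X, y)
print(f"removed {dropped.sum()} majority instances; "
      f"{(y_res == 1).sum()} minority instances kept")
```

In this sketch only majority instances can be removed, so every minority instance survives by construction. How many majority instances are removed depends on the threshold and on how the clusters happen to align with the classes.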
Citation
VUTTIPITTAYAMONGKOL, P. and ELYAN, E. 2020. Improved overlap-based undersampling for imbalanced dataset classification with application to epilepsy and Parkinson's disease. International journal of neural systems [online], 30(8), article ID 2050043. Available from: https://doi.org/10.1142/S0129065720500434
Journal Article Type | Article |
---|---|
Acceptance Date | May 2, 2020 |
Online Publication Date | Jul 17, 2020 |
Publication Date | Aug 31, 2020 |
Deposit Date | Jun 30, 2020 |
Publicly Available Date | Jul 18, 2021 |
Journal | International journal of neural systems |
Print ISSN | 0129-0657 |
Electronic ISSN | 1793-6462 |
Publisher | World Scientific Publishing |
Peer Reviewed | Peer Reviewed |
Volume | 30 |
Issue | 8 |
Article Number | 2050043 |
DOI | https://doi.org/10.1142/S0129065720500434 |
Keywords | Class overlap; Imbalanced data; Undersampling; Classification; Adaptive threshold; Fuzzy C-means; Epilepsy; Parkinson's Disease |
Public URL | https://rgu-repository.worktribe.com/output/940589 |
Related Public URLs | https://rgu-repository.worktribe.com/output/969620 |
Files
VUTTIPITTAYAMONGKOL 2020 Improved overlap
(1.2 Mb)
PDF
Publisher Licence URL
https://creativecommons.org/licenses/by-nc/4.0/