Pattaramon Vuttipittayamongkol
Overlap-based undersampling for improving imbalanced data classification.
Vuttipittayamongkol, Pattaramon; Elyan, Eyad; Petrovski, Andrei; Jayne, Chrisina
Authors
Professor Eyad Elyan e.elyan@rgu.ac.uk
Professor
Dr Andrei Petrovski a.petrovski@rgu.ac.uk
Associate Professor
Chrisina Jayne
Contributors
Hujun Yin
Editor
David Camacho
Editor
Paulo Novais
Editor
Antonio J. Tallón-Ballesteros
Editor
Abstract
Classification of imbalanced data remains an important field in machine learning. Several methods have been proposed to address the class imbalance problem, including data resampling, adaptive learning and cost-adjusting algorithms. Data resampling methods are widely used due to their simplicity and flexibility. Most existing resampling techniques aim to rebalance the class distribution. However, class imbalance is not the only factor that affects the performance of the learning algorithm. Class overlap has been shown to have a greater impact on the classification of imbalanced datasets than the dominance of the negative class. In this paper, we propose a new undersampling method that eliminates negative instances from the overlapping region and hence improves the visibility of the minority instances. Testing and evaluating the proposed method on 36 public imbalanced datasets showed statistically significant improvements in classification performance.
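To make the idea of removing negative instances from the overlapping region concrete, the following is a minimal sketch in Python. It is not the method from the paper: this record does not describe the algorithm, so every detail here is an assumption made for illustration. The sketch treats the two class means as fixed cluster centres, scores each negative instance with a fuzzy-c-means-style membership to the positive centre, and drops negatives whose membership exceeds a threshold. The function names (`centroid`, `positive_membership`, `overlap_undersample`), the fuzzifier `m` and the `threshold` parameter are all hypothetical.

```python
import math


def centroid(points):
    """Arithmetic mean of a non-empty list of equal-length points."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]


def positive_membership(x, c_pos, c_neg, m=2.0):
    """Two-cluster fuzzy-c-means-style membership of x to the positive centre.

    Assumption: the centres are fixed class means, not fitted iteratively.
    """
    d_pos = math.dist(x, c_pos)
    d_neg = math.dist(x, c_neg)
    if d_pos == 0.0:
        return 1.0  # x sits exactly on the positive centre
    if d_neg == 0.0:
        return 0.0  # x sits exactly on the negative centre
    ratio = (d_pos / d_neg) ** (2.0 / (m - 1.0))
    return 1.0 / (1.0 + ratio)


def overlap_undersample(X, y, threshold=0.5, m=2.0):
    """Drop negative (label 0) instances that look like they lie in the overlap.

    Positive (label 1) instances are always kept. The threshold is an
    illustrative choice, not a value taken from the paper.
    """
    pos = [x for x, label in zip(X, y) if label == 1]
    neg = [x for x, label in zip(X, y) if label == 0]
    c_pos, c_neg = centroid(pos), centroid(neg)

    X_out, y_out = [], []
    for x, label in zip(X, y):
        if label == 0 and positive_membership(x, c_pos, c_neg, m) > threshold:
            continue  # negative instance judged to be in the overlapping region
        X_out.append(x)
        y_out.append(label)
    return X_out, y_out


# Toy usage: four negatives far from the positives, one negative among them.
X = [(0, 0), (0, 1), (1, 0), (1, 1), (4.5, 4.5), (5, 5), (4, 4)]
y = [0, 0, 0, 0, 0, 1, 1]
X_res, y_res = overlap_undersample(X, y)
```

In this toy example the negative instance lying between the two positives is the only one removed, which is the intended effect: the minority class stays intact while the majority class is thinned only where the classes overlap.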
Citation
VUTTIPITTAYAMONGKOL, P., ELYAN, E., PETROVSKI, A. and JAYNE, C. 2018. Overlap-based undersampling for improving imbalanced data classification. In Yin, H., Camacho, D., Novais, P. and Tallón-Ballesteros, A. (eds.) Intelligent data engineering and automated learning: proceedings of the 19th International intelligent data engineering and automated learning conference (IDEAL 2018), 21-23 November 2018, Madrid, Spain. Lecture notes in computer science, 11314. Cham: Springer [online], pages 689-697. Available from: https://doi.org/10.1007/978-3-030-03493-1_72
Conference Name | 19th International intelligent data engineering and automated learning conference (IDEAL 2018) |
---|---|
Conference Location | Madrid, Spain |
Start Date | Nov 21, 2018 |
End Date | Nov 23, 2018 |
Acceptance Date | Aug 8, 2018 |
Online Publication Date | Nov 9, 2018 |
Publication Date | Dec 21, 2018 |
Deposit Date | Feb 8, 2019 |
Publicly Available Date | Nov 10, 2019 |
Publisher | Springer |
Pages | 689-697 |
Series Title | Lecture notes in computer science |
Series Number | 11314 |
Series ISSN | 0302-9743 |
ISBN | 9783030034924 |
DOI | https://doi.org/10.1007/978-3-030-03493-1_72 |
Keywords | Undersampling; Overlap; Imbalanced data; Classification; Fuzzy C-means; Resampling |
Public URL | http://hdl.handle.net/10059/3281 |
Files
VUTTIPITTAYAMONGKOL 2018 Overlap-based undersampling
(1.2 Mb)
PDF
Publisher Licence URL
https://creativecommons.org/licenses/by-nc/4.0/