Anh Vu Luong
Heterogeneous ensemble selection for evolving data streams.
Luong, Anh Vu; Nguyen, Tien Thanh; Liew, Alan Wee-Chung; Wang, Shilin
Abstract
Ensemble learning has been widely applied to both batch data classification and streaming data classification. For the latter setting, most existing ensemble systems are homogeneous, which means they are generated from only one type of learning model. In contrast, by combining several different types of learning models, a heterogeneous ensemble system can achieve greater diversity among its members, which helps to improve its performance. Although heterogeneous ensemble systems have achieved many successes in the batch classification setting, it is not trivial to extend them directly to the data stream setting. In this study, we propose a novel HEterogeneous Ensemble Selection (HEES) method, which dynamically selects an appropriate subset of base classifiers to predict data under the stream setting. We are inspired by the observation that a well-chosen subset of good base classifiers may outperform the whole ensemble system. Here, we define a good candidate as one that exhibits not only high predictive performance but also high confidence in its prediction. Our selection process is thus divided into two sub-processes: accurate-candidate selection and confident-candidate selection. We define an accurate candidate in the stream context as a base classifier with high accuracy over the current concept, and a confident candidate as one with a confidence score higher than a certain threshold. In the first sub-process, we employ the prequential accuracy to estimate the performance of a base classifier at a specific time, while in the second sub-process, we propose a new measure to quantify the predictive confidence and provide a method to learn the threshold incrementally. The final ensemble is formed by taking the intersection of the sets of confident classifiers and accurate classifiers.
Experiments on a wide range of data streams show that the proposed method achieves competitive performance with lower running time in comparison to the state-of-the-art online ensemble methods.
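The selection idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' HEES implementation: the abstract does not specify the confidence measure, the threshold-learning rule, or how "high accuracy" is decided, so the sketch substitutes simple stand-ins. It assumes confidence is the largest predicted class probability, the threshold is the running mean of observed confidences, and a classifier counts as accurate if its cumulative prequential accuracy is within a hypothetical `accuracy_margin` of the best one. All class and function names are hypothetical, and the training of the base classifiers is omitted.

```python
from dataclasses import dataclass


@dataclass
class CandidateState:
    """Cumulative prequential (test-then-train) accuracy of one base classifier."""
    correct: int = 0
    seen: int = 0

    def accuracy(self) -> float:
        return self.correct / self.seen if self.seen else 0.0

    def update(self, was_correct: bool) -> None:
        self.seen += 1
        self.correct += int(was_correct)


class RunningThreshold:
    """Assumed threshold rule: incremental mean of all confidences seen so far."""

    def __init__(self) -> None:
        self.mean = 0.0
        self.n = 0

    def update(self, value: float) -> None:
        self.n += 1
        self.mean += (value - self.mean) / self.n


def select_ensemble(probas, states, threshold, accuracy_margin=0.05):
    """Return indices of classifiers that are both accurate and confident.

    probas[i] is the class-probability vector of classifier i for the
    current instance. accuracy_margin is a hypothetical parameter.
    """
    accs = [s.accuracy() for s in states]
    best = max(accs)
    accurate = {i for i, a in enumerate(accs) if a >= best - accuracy_margin}
    # Assumed confidence measure: the largest predicted class probability.
    confident = {i for i, p in enumerate(probas) if max(p) >= threshold.mean}
    selected = accurate & confident
    # Assumed fallback: if the intersection is empty, keep the accurate set.
    return selected or accurate


def prequential_step(probas, label, states, threshold):
    """Select first (test), then update accuracy and threshold statistics."""
    selected = select_ensemble(probas, states, threshold)
    for p, s in zip(probas, states):
        predicted = max(range(len(p)), key=p.__getitem__)
        s.update(predicted == label)
        threshold.update(max(p))
    return selected
```

Calling `prequential_step` once per arriving instance keeps the statistics current. For drifting streams, a sliding window or fading factor in place of the cumulative counts would track "the current concept" more closely; the cumulative form is used here only to keep the sketch short.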
Citation
LUONG, A.V., NGUYEN, T.T., LIEW, A.W.-C. and WANG, S. 2021. Heterogeneous ensemble selection for evolving data streams. Pattern recognition [online], 112, article ID 107743. Available from: https://doi.org/10.1016/j.patcog.2020.107743
Field | Value |
---|---|
Journal Article Type | Article |
Acceptance Date | Oct 30, 2020 |
Online Publication Date | Nov 2, 2020 |
Publication Date | Apr 30, 2021 |
Deposit Date | Jan 16, 2021 |
Publicly Available Date | Nov 3, 2021 |
Journal | Pattern recognition |
Print ISSN | 0031-3203 |
Electronic ISSN | 1873-5142 |
Publisher | Elsevier |
Peer Reviewed | Peer Reviewed |
Volume | 112 |
Article Number | 107743 |
DOI | https://doi.org/10.1016/j.patcog.2020.107743 |
Keywords | Data streams; Heterogeneous ensembles; Ensemble selection |
Public URL | https://rgu-repository.worktribe.com/output/982146 |
Related Public URLs | https://rgu-repository.worktribe.com/output/1167901 |
Files
LUONG 2021 Heterogeneous ensemble (AAM)
(1.1 Mb)
PDF
Publisher Licence URL
https://creativecommons.org/licenses/by-nc-nd/4.0/