Heterogeneous ensemble selection for evolving data streams.

Luong, Anh Vu; Nguyen, Tien Thanh; Liew, Alan Wee-Chung; Wang, Shilin

Authors

Anh Vu Luong

Alan Wee-Chung Liew

Shilin Wang



Abstract

Ensemble learning has been widely applied to both batch data classification and streaming data classification. For the latter setting, most existing ensemble systems are homogeneous, which means they are generated from only one type of learning model. In contrast, by combining several different types of learning models, a heterogeneous ensemble system can achieve greater diversity among its members, which helps to improve its performance. Although heterogeneous ensemble systems have achieved many successes in the batch classification setting, it is not trivial to extend them directly to the data stream setting. In this study, we propose a novel HEterogeneous Ensemble Selection (HEES) method, which dynamically selects an appropriate subset of base classifiers to predict data under the stream setting. We are inspired by the observation that a well-chosen subset of good base classifiers may outperform the whole ensemble system. Here, we define a good candidate as one that expresses not only high predictive performance but also high confidence in its prediction. Our selection process is thus divided into two sub-processes: accurate-candidate selection and confident-candidate selection. We define an accurate candidate in the stream context as a base classifier with high accuracy over the current concept, and a confident candidate as one with a confidence score higher than a certain threshold. In the first sub-process, we employ the prequential accuracy to estimate the performance of a base classifier at a specific time, while in the second sub-process, we propose a new measure to quantify the predictive confidence and provide a method to learn the threshold incrementally. The final ensemble is formed by taking the intersection of the sets of confident classifiers and accurate classifiers.
Experiments on a wide range of data streams show that the proposed method achieves competitive performance with lower running time in comparison to the state-of-the-art online ensemble methods.

Citation

LUONG, A.V., NGUYEN, T.T., LIEW, A.W.-C. and WANG, S. 2021. Heterogeneous ensemble selection for evolving data streams. Pattern recognition [online], 112, article ID 107743. Available from: https://doi.org/10.1016/j.patcog.2020.107743

Journal Article Type Article
Acceptance Date Oct 30, 2020
Online Publication Date Nov 2, 2020
Publication Date Apr 30, 2021
Deposit Date Jan 16, 2021
Publicly Available Date Nov 3, 2021
Journal Pattern Recognition
Print ISSN 0031-3203
Electronic ISSN 1873-5142
Publisher Elsevier
Peer Reviewed Peer Reviewed
Volume 112
Article Number 107743
DOI https://doi.org/10.1016/j.patcog.2020.107743
Keywords Data streams; Heterogeneous ensembles; Ensemble selection
Public URL https://rgu-repository.worktribe.com/output/982146
Related Public URLs https://rgu-repository.worktribe.com/output/1167901

Files

This file is under embargo until Nov 3, 2021 due to copyright reasons.

Contact publications@rgu.ac.uk to request a copy for personal use.
