Anh Vu Luong
Heterogeneous ensemble selection for evolving data streams.
Luong, Anh Vu; Nguyen, Tien Thanh; Liew, Alan Wee-Chung; Wang, Shilin
Ensemble learning has been widely applied to both batch data classification and streaming data classification. For the latter setting, most existing ensemble systems are homogeneous, which means they are generated from only one type of learning model. In contrast, by combining several different types of learning models, a heterogeneous ensemble system can achieve greater diversity among its members, which helps to improve its performance. Although heterogeneous ensemble systems have achieved many successes in the batch classification setting, it is not trivial to extend them directly to the data stream setting. In this study, we propose a novel HEterogeneous Ensemble Selection (HEES) method, which dynamically selects an appropriate subset of base classifiers to predict data under the stream setting. We are inspired by the observation that a well-chosen subset of good base classifiers may outperform the whole ensemble system. Here, we define a good candidate as one that exhibits not only high predictive performance but also high confidence in its prediction. Our selection process is thus divided into two sub-processes: accurate-candidate selection and confident-candidate selection. We define an accurate candidate in the stream context as a base classifier with high accuracy over the current concept, and a confident candidate as one with a confidence score higher than a certain threshold. In the first sub-process, we employ prequential accuracy to estimate the performance of a base classifier at a specific time, while in the second sub-process, we propose a new measure to quantify the predictive confidence and provide a method to learn the threshold incrementally. The final ensemble is formed by taking the intersection of the sets of confident classifiers and accurate classifiers. Experiments on a wide range of data streams show that the proposed method achieves competitive performance with lower running time than state-of-the-art online ensemble methods.
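The selection step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the confidence measure, the accuracy cut-off, or how the threshold is learned, so the sketch substitutes hypothetical stand-ins. Here `confidence` is the gap between the two largest class probabilities, a classifier counts as accurate when its prequential accuracy is at least the ensemble mean, the threshold is passed in as a fixed value, and the fallback for an empty intersection is also an assumption.

```python
from dataclasses import dataclass


@dataclass
class PrequentialAccuracy:
    """Test-then-train accuracy: each example is scored before it is learned."""
    correct: int = 0
    seen: int = 0

    def update(self, predicted, actual):
        # Called with the prediction made BEFORE the classifier trains on the example.
        self.correct += int(predicted == actual)
        self.seen += 1

    @property
    def value(self):
        return self.correct / self.seen if self.seen else 0.0


def confidence(probabilities):
    """Hypothetical confidence score: margin between the top two class probabilities.

    Assumption for illustration only; the paper proposes its own measure.
    """
    ranked = sorted(probabilities, reverse=True)
    return ranked[0] - ranked[1] if len(ranked) > 1 else ranked[0]


def select_ensemble(accuracies, confidences, threshold):
    """Return indices of base classifiers that are both accurate and confident."""
    # Assumed cut-off: "accurate" means at least the mean prequential accuracy.
    mean_accuracy = sum(accuracies) / len(accuracies)
    accurate = {i for i, a in enumerate(accuracies) if a >= mean_accuracy}
    confident = {i for i, c in enumerate(confidences) if c > threshold}
    selected = accurate & confident  # intersection, as described in the abstract
    # Assumed fallback: if no classifier passes both tests, use the accurate set.
    return sorted(selected) if selected else sorted(accurate)


if __name__ == "__main__":
    accuracies = [0.9, 0.6, 0.8]   # prequential accuracy per base classifier
    confidences = [0.7, 0.9, 0.2]  # confidence score per base classifier
    print(select_ensemble(accuracies, confidences, threshold=0.5))  # [0]
```

In this toy run, classifiers 0 and 2 pass the accuracy test and classifiers 0 and 1 pass the confidence test, so only classifier 0 is selected to predict the incoming example.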
LUONG, A.V., NGUYEN, T.T., LIEW, A.W.-C. and WANG, S. 2021. Heterogeneous ensemble selection for evolving data streams. Pattern recognition [online], 112, article ID 107743. Available from: https://doi.org/10.1016/j.patcog.2020.107743
| Field | Value |
|---|---|
| Journal Article Type | Article |
| Acceptance Date | Oct 30, 2020 |
| Online Publication Date | Nov 2, 2020 |
| Publication Date | Apr 30, 2021 |
| Deposit Date | Jan 16, 2021 |
| Publicly Available Date | Nov 3, 2021 |
| Peer Reviewed | Peer Reviewed |
| Keywords | Data streams; Heterogeneous ensembles; Ensemble selection |
| Related Public URLs | https://rgu-repository.worktribe.com/output/1167901 |