Dr Thanh Nguyen t.nguyen11@rgu.ac.uk
Senior Research Fellow
Manh Truong Dang
Anh Vu Luong
Alan Wee-Chung Liew
Tiancai Liang
Professor John McCall j.mccall@rgu.ac.uk
Interim Director
With the advancement of storage and processing technology, an enormous amount of data is collected on a daily basis in many applications. Advanced data analytics are now used to mine the collected data for useful information and to make predictions, contributing to the competitive advantage of companies. The increasing data volume, however, poses many problems for classical batch learning systems, such as the need to retrain the model completely when new samples arrive and the impracticality of storing and accessing a large volume of data. This has prompted interest in incremental learning that operates on data streams. In this study, we develop an incremental online multi-label classification (OMLC) method based on a weighted clustering model. The model adapts to changes in the data via a decay mechanism in which each sample's weight diminishes over time, so the clustering model always focuses more on newly arrived samples. In the classification process, only clusters whose weights are greater than a threshold (called mature clusters) are used to assign labels to the samples. In our method, not only is the clustering model incrementally maintained with the revealed ground truth labels of the arrived samples, but the number of predicted labels for a sample is also adjusted based on the Hoeffding inequality and the label cardinality. The experimental results show that our method is competitive with several well-known benchmark algorithms on six performance measures in both the stationary and the concept drift settings.
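The abstract names three ingredients: decaying cluster weights, a maturity threshold, and the Hoeffding inequality. The Python sketch below illustrates each in isolation. It is not the authors' implementation. The exponential decay form `2 ** (-lam * dt)`, the class `DecayingCluster`, the function names, and the parameters `lam`, `threshold`, `value_range` and `delta` are all assumptions made for illustration; the abstract does not specify them.

```python
import math


class DecayingCluster:
    """Hypothetical cluster whose weight fades over time unless new samples arrive."""

    def __init__(self, center, labels, t):
        self.center = list(center)                 # weighted mean of absorbed samples
        self.label_weight = [float(y) for y in labels]  # decayed per-label counts
        self.weight = 1.0                          # decayed number of absorbed samples
        self.last_update = t

    def decay(self, t, lam):
        # Assumed decay function: weights halve every 1/lam time units.
        factor = 2.0 ** (-lam * (t - self.last_update))
        self.weight *= factor
        self.label_weight = [w * factor for w in self.label_weight]
        self.last_update = t

    def absorb(self, x, labels, t, lam):
        # Age the cluster first, then add the new sample with weight 1.
        self.decay(t, lam)
        new_weight = self.weight + 1.0
        self.center = [(c * self.weight + xi) / new_weight
                       for c, xi in zip(self.center, x)]
        self.label_weight = [w + y for w, y in zip(self.label_weight, labels)]
        self.weight = new_weight


def mature_clusters(clusters, t, lam, threshold):
    """Age every cluster to time t and keep those whose weight exceeds the threshold."""
    for c in clusters:
        c.decay(t, lam)
    return [c for c in clusters if c.weight > threshold]


def hoeffding_bound(value_range, delta, n):
    """Standard Hoeffding bound: with probability 1 - delta, the mean of n
    observations with range value_range is within this distance of the true mean."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))
```

In a sketch like this, a cluster that stops receiving samples eventually drops below `threshold` and no longer takes part in prediction, which is one way the focus on recent data described above could be realised. How the method combines the Hoeffding bound with label cardinality to set the number of predicted labels is not stated in the abstract, so it is not modelled here.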
NGUYEN, T.T., DANG, M.T., LUONG, A.V., LIEW, A. W.-C., LIANG, T. and MCCALL, J. 2019. Multi-label classification via incremental clustering on an evolving data stream. Pattern recognition [online], 95, pages 96-113. Available from: https://doi.org/10.1016/j.patcog.2019.06.001
Journal Article Type | Article |
---|---|
Acceptance Date | Jun 1, 2019 |
Online Publication Date | Jun 3, 2019 |
Publication Date | Nov 30, 2019 |
Deposit Date | Jul 15, 2019 |
Publicly Available Date | Jun 4, 2020 |
Journal | Pattern Recognition |
Print ISSN | 0031-3203 |
Electronic ISSN | 1873-5142 |
Publisher | Elsevier |
Peer Reviewed | Peer Reviewed |
Volume | 95 |
Pages | 96-113 |
DOI | https://doi.org/10.1016/j.patcog.2019.06.001 |
Keywords | Multi-label classification; Incremental learning; Online learning; Clustering; Data stream; Concept drift |
Public URL | https://rgu-repository.worktribe.com/output/321502 |
File: NGUYEN 2019 Multi-label classification(incremental) (PDF, 1.1 Mb)
Publisher Licence URL: https://creativecommons.org/licenses/by-nc-nd/4.0/