Data stream mining: methods and challenges for handling concept drift.
Wares, Scott; Isaacs, John; Elyan, Eyad
Doctor Eyad Elyan email@example.com
Mining and analysing streaming data is crucial for many applications, and this area of research has gained extensive attention over the past decade. However, there are several inherent problems that continue to challenge the hardware and the state-of-the-art algorithmic solutions. Examples of such problems include the unbounded size, varying speed and unknown data characteristics of instances arriving from a data stream. The aim of this research is to portray key challenges faced by algorithmic solutions for stream mining, focusing particularly on the prevalent issue of concept drift. A comprehensive discussion of concept drift and its inherent data challenges in the context of stream mining is presented, as is a critical, in-depth review of relevant literature. Current issues with the evaluative procedure for concept drift detectors are also explored, highlighting problems such as a lack of established base datasets and the impact of temporal dependence on concept drift detection. By exposing gaps in the current literature, this study makes recommendations for future research which should aid in the progression of stream mining and concept drift detection algorithms.
|Journal Article Type||Article|
|Publication Date||Nov 30, 2019|
|Journal||SN Applied Sciences|
|Publisher||Springer (part of Springer Nature)|
|Peer Reviewed||Peer Reviewed|
|Institution Citation||WARES, S., ISAACS, J. and ELYAN, E. 2019. Data stream mining: methods and challenges for handling concept drift. SN applied sciences [online], 1(11), article ID 1412. Available from: https://doi.org/10.1007/s42452-019-1433-0|
|Keywords||Data streams; Data mining; Concept drift; Concept drift detection|