Towards handling temporal dependence in concept drift streams.
Authors
Scott Brian Wares
Contributors
Dr John Isaacs j.p.isaacs@rgu.ac.uk
Supervisor
Professor Eyad Elyan e.elyan@rgu.ac.uk
Supervisor
Abstract
Modern technological advancements have led to the production of a vast amount of data from a wide array of devices. A constant supply of new data provides an invaluable opportunity for access to qualitative and quantitative insights. Organisations recognise that data provides a means of mitigating risk and loss whilst maximising efficiency and profit. However, processing this data is not without its challenges. Much of this data is produced in an online environment. Real-time stream data is unbounded in size, variety and velocity. Data may arrive complete or with missing attributes, and data availability and persistence are limited to a small window of time. Classification methods and techniques that process offline data are not applicable to online data streams. Instead, new online classification methods have been developed. Research concerning the problematic and prevalent issue of concept drift has produced a considerable number of methods that allow online classifiers to adapt to changes in the stream distribution. However, recent research suggests that the presence of temporal dependence can cause misleading evaluation when accuracy is used as the core metric. This thesis investigates temporal dependence and its negative effects upon the classification of concept drift data. First, this thesis proposes a novel method for coping with temporal dependence during the classification of real-time data streams where concept drift is present. Results indicate that a statistics-based, selective resetting approach can reduce the impact of temporal dependence in concept drift streams without significant loss in predictive accuracy. Secondly, a new ensemble-based method, KTUE, that adopts the Kappa-Temporal statistic for vote weighting is suggested. Results show that this method is capable of outperforming some state-of-the-art ensemble methods in both temporally dependent and non-temporally dependent environments.
Finally, this research proposes a novel algorithm for the simulation of temporally dependent concept drift data, which aims to help address the lack of established datasets available for evaluation. Experimental results show that temporal dependence can be injected into fabricated data streams using existing generation methods.
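As an illustration of why accuracy alone can mislead on temporally dependent streams, the sketch below computes the Kappa-Temporal statistic in the form commonly given in the stream-mining literature: (p0 - pe) / (1 - pe), where p0 is the classifier's accuracy and pe is the accuracy of a "no-change" classifier that always predicts the previous true label. The function name and the example stream are hypothetical, and this is not the thesis's KTUE method, only a minimal sketch of the statistic it is said to use for vote weighting.

```python
from typing import Hashable, Sequence


def kappa_temporal(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Kappa-Temporal statistic: (p0 - pe) / (1 - pe).

    p0 is the accuracy of the evaluated classifier; pe is the accuracy of
    a no-change classifier that predicts the previous true label.
    Hypothetical helper for illustration only.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    n = len(y_true)
    if n < 2:
        raise ValueError("need at least two instances")

    # Score from the second instance onward, because the no-change
    # classifier has no previous label for the first instance.
    hits = sum(t == p for t, p in zip(y_true[1:], y_pred[1:]))
    no_change_hits = sum(cur == prev for prev, cur in zip(y_true, y_true[1:]))

    p0 = hits / (n - 1)
    pe = no_change_hits / (n - 1)
    if pe == 1.0:
        return float("nan")  # undefined when the label never changes
    return (p0 - pe) / (1.0 - pe)


# Assumed toy stream with strong temporal dependence (long runs of one label).
stream = [0, 0, 0, 0, 1, 1, 1, 1]
no_change_predictions = [0] + stream[:-1]  # always repeat the previous label
always_zero = [0] * len(stream)
```

On this toy stream the no-change predictions are correct on 6 of the 7 scored instances, yet their Kappa-Temporal value is 0, because they are no better than the baseline they are compared against. A perfect classifier scores 1, and a classifier that does worse than the no-change baseline scores below 0.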
Citation
WARES, S.B. 2023. Towards handling temporal dependence in concept drift streams. Robert Gordon University, PhD thesis. Hosted on OpenAIR [online]. Available from: https://doi.org/10.48526/rgu-wt-2271523
Thesis Type | Thesis |
---|---|
Deposit Date | Mar 14, 2024 |
Publicly Available Date | Mar 14, 2024 |
DOI | https://doi.org/10.48526/rgu-wt-2271523 |
Keywords | Data; Data streams; Data classification; Concept drift; Temporal dependence |
Public URL | https://rgu-repository.worktribe.com/output/2271523 |
Award Date | May 31, 2023 |
Files
WARES 2023 Towards handling temporal
(2.7 Mb)
PDF
Licence
https://creativecommons.org/licenses/by-nc/4.0/
Copyright Statement
© The Author.