Class-decomposition and augmentation for imbalanced data sentiment analysis.

Moreno-Garcia, Carlos Francisco; Jayne, Chrisina; Elyan, Eyad

Authors

Chrisina Jayne



Abstract

Significant progress has been made in text classification and natural language processing. However, like many datasets from other domains, text-based datasets may suffer from class imbalance, which biases a model toward the majority class instances. In this paper, we present a new approach to handling class imbalance in text data by means of unsupervised learning algorithms. We present class decomposition using two different unsupervised methods, namely k-means and Density-Based Spatial Clustering of Applications with Noise (DBSCAN), applied to two different sentiment analysis datasets. The experimental results show that using clustering to find within-class similarities can significantly improve the performance of learning algorithms and reduce the dominance of the majority class instances without causing information loss.
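The idea of class decomposition described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `kmeans` and `decompose_majority`, the toy 2-D vectors standing in for text features, and the choice to decompose only the largest class are all assumptions made for illustration. The sketch clusters the majority class with a plain k-means and relabels each instance with its cluster's sub-class, so no instance is discarded.

```python
import random
from collections import Counter


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means over equal-length numeric tuples.

    Returns one cluster index per point. Illustrative only.
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initial centroids drawn from the data
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        assign = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return assign


def decompose_majority(X, y, k=2, seed=0):
    """Relabel majority-class instances with per-cluster sub-class labels.

    Hypothetical helper: assumes the class to decompose is the largest one.
    All instances are kept; only the labels change.
    """
    majority = Counter(y).most_common(1)[0][0]
    idx = [i for i, label in enumerate(y) if label == majority]
    clusters = kmeans([X[i] for i in idx], k, seed=seed)
    new_y = list(y)
    for i, c in zip(idx, clusters):
        new_y[i] = f"{majority}_c{c}"
    return new_y


# Toy data: six "neg" instances in two well-separated groups, two "pos" instances.
X = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
     (5.0, 5.0), (5.1, 5.0), (5.0, 5.1),
     (9.0, 0.0), (9.1, 0.0)]
y = ["neg"] * 6 + ["pos"] * 2

new_y = decompose_majority(X, y, k=2)
print(Counter(y))      # class sizes before decomposition
print(Counter(new_y))  # class sizes after decomposition
```

On this toy data the largest class shrinks from six instances to two sub-classes of three, which narrows the gap to the minority class while keeping every instance. A density-based method such as DBSCAN could be substituted for the k-means step; a classifier trained on the sub-class labels would then map its predictions back to the original classes.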

Citation

MORENO-GARCIA, C.F., JAYNE, C. and ELYAN, E. 2021. Class-decomposition and augmentation for imbalanced data sentiment analysis. In Proceedings of 2021 International joint conference on neural networks (IJCNN 2021), 18-22 July 2021, [virtual conference]. Piscataway: IEEE [online], article 9533603. Available from: https://doi.org/10.1109/IJCNN52387.2021.9533603

Presentation Conference Type Conference Paper (published)
Conference Name 2021 International joint conference on neural networks (IJCNN 2021)
Start Date Jul 18, 2021
End Date Jul 22, 2021
Acceptance Date Apr 10, 2021
Online Publication Date Jul 22, 2021
Publication Date Sep 20, 2021
Deposit Date Sep 24, 2021
Publicly Available Date Sep 24, 2021
Publisher Institute of Electrical and Electronics Engineers (IEEE)
Peer Reviewed Peer Reviewed
Series ISSN 2161-4407
Book Title Proceedings of 2021 International joint conference on neural networks (IJCNN 2021)
ISBN 9780738133669
DOI https://doi.org/10.1109/ijcnn52387.2021.9533603
Keywords Sentiment analysis; Text imbalanced datasets; Class decomposition
Public URL https://rgu-repository.worktribe.com/output/1465455

Files

MORENO-GARCIA 2021 Class-decomposition (AAM) (441 Kb)
PDF

Copyright Statement
© 2021 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.





