Dr Carlos Moreno-Garcia c.moreno-garcia@rgu.ac.uk
Associate Professor
Class-decomposition and augmentation for imbalanced data sentiment analysis.
Moreno-Garcia, Carlos Francisco; Jayne, Chrisina; Elyan, Eyad
Authors
Chrisina Jayne
Professor Eyad Elyan e.elyan@rgu.ac.uk
Professor
Abstract
Significant progress has been made in the area of text classification and natural language processing. However, like many other datasets from across different domains, text-based datasets may suffer from class imbalance. This problem biases models toward the majority-class instances. In this paper, we present a new approach to handling class imbalance in text data by means of unsupervised learning algorithms. We present class-decomposition using two different unsupervised methods, namely k-means and Density-Based Spatial Clustering of Applications with Noise (DBSCAN), applied to two different sentiment analysis datasets. The experimental results show that using clustering to find within-class similarities can significantly improve the performance of learning algorithms while also reducing the dominance of the majority-class instances without causing information loss.
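The general idea of class-decomposition described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not give the feature extraction, the choice of k, the augmentation step, or the classifier, so everything below is an assumption made for illustration. The helper names (`kmeans`, `decompose_majority`, `original_label`), the toy 2-D points standing in for text features, and the `_c<index>` sub-label scheme are all hypothetical. A small pure-Python k-means is used so the example has no dependencies; a DBSCAN clustering could be substituted in the same place.

```python
import random
from collections import Counter


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means on lists of floats; returns one cluster index per point."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        assign = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign


def decompose_majority(X, y, k):
    """Relabel majority-class instances with cluster-based sub-labels (hypothetical scheme).

    No instance is dropped: the majority class is split into up to k sub-classes.
    """
    majority = Counter(y).most_common(1)[0][0]
    idx = [i for i, label in enumerate(y) if label == majority]
    clusters = kmeans([X[i] for i in idx], k)
    new_y = list(y)
    for i, c in zip(idx, clusters):
        new_y[i] = f"{majority}_c{c}"
    return new_y


def original_label(sub_label):
    """Map a sub-label such as 'pos_c1' back to its original class 'pos'."""
    return sub_label.rsplit("_c", 1)[0]


# Toy data (an assumption): 8 "pos" points and 2 "neg" points in a 2-D feature space.
X = [[0.0, 0.1], [0.1, 0.0], [0.2, 0.1], [0.1, 0.2],
     [5.0, 5.1], [5.1, 5.0], [5.2, 5.1], [5.1, 5.2],
     [9.0, 0.0], [9.1, 0.1]]
y = ["pos"] * 8 + ["neg"] * 2

new_y = decompose_majority(X, y, k=2)
print(Counter(new_y))                              # class sizes after decomposition
print(Counter(original_label(l) for l in new_y))   # original classes are recoverable
```

A classifier would then be trained on `new_y`, and its predictions mapped back through `original_label` before evaluation. Because every instance is kept, this differs from undersampling, which discards majority-class data.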
Citation
MORENO-GARCIA, C.F., JAYNE, C. and ELYAN, E. 2021. Class-decomposition and augmentation for imbalanced data sentiment analysis. In Proceedings of 2021 International joint conference on neural networks (IJCNN 2021), 18-22 July 2021, [virtual conference]. Piscataway: IEEE [online], article 9533603. Available from: https://doi.org/10.1109/IJCNN52387.2021.9533603
Presentation Conference Type | Conference Paper (published) |
---|---|
Conference Name | 2021 International joint conference on neural networks (IJCNN 2021) |
Start Date | Jul 18, 2021 |
End Date | Jul 22, 2021 |
Acceptance Date | Apr 10, 2021 |
Online Publication Date | Jul 22, 2021 |
Publication Date | Sep 20, 2021 |
Deposit Date | Sep 24, 2021 |
Publicly Available Date | Sep 24, 2021 |
Publisher | Institute of Electrical and Electronics Engineers (IEEE) |
Peer Reviewed | Peer Reviewed |
Series ISSN | 2161-4407 |
Book Title | Proceedings of 2021 International joint conference on neural networks (IJCNN 2021) |
ISBN | 9780738133669 |
DOI | https://doi.org/10.1109/ijcnn52387.2021.9533603 |
Keywords | Sentiment analysis; Text imbalanced datasets; Class decomposition |
Public URL | https://rgu-repository.worktribe.com/output/1465455 |
Files
MORENO-GARCIA 2021 Class-decomposition (AAM)
(441 Kb)
PDF
Copyright Statement
© 2021 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.