Lexicon based feature extraction for emotion text classification.

Bandhakavi, Anil; Wiratunga, Nirmalie; Padmanabhan, Deepak; Massie, Stewart

doi:10.1016/j.patrec.2016.12.009

Authors

Anil Bandhakavi

Professor Nirmalie Wiratunga n.wiratunga@rgu.ac.uk
Associate Dean for Research

Deepak Padmanabhan

Dr Stewart Massie s.massie@rgu.ac.uk
Associate Professor

Abstract

General Purpose Emotion Lexicons (GPELs) that associate words with emotion categories remain a valuable resource for emotion analysis of text. However, the static and formal nature of their vocabularies makes them inadequate for extracting effective features for document representation in domains that are inherently dynamic (e.g. social media). This calls for lexicons that not only adapt to the lexical variations in a domain but also provide finer-grained quantitative estimates that accurately capture word-emotion associations. In this paper, we extend prior work on domain-specific emotion lexicon (DSEL) generation and apply it to emotion feature extraction. We demonstrate how our DSEL, based on a generative unigram mixture model (UMM) and learnt from labelled (blogs, news headlines and incident reports) and weakly labelled (tweets) emotion text, can be used to extract effective features for emotion classification. Our results confirm that the features derived using the proposed lexicon outperform those from state-of-the-art lexicons learnt using supervised Latent Dirichlet Allocation (sLDA) and Point-Wise Mutual Information (PMI). Further, the proposed lexicon features also outperform state-of-the-art features derived using a combination of n-grams, part-of-speech information and sentiment lexicons.

Citation

BANDHAKAVI, A., WIRATUNGA, N., DEEPAK, P. and MASSIE, S. 2017. Lexicon based feature extraction for emotion text classification. Pattern recognition letters [online], 93, pages 133-142. Available from: https://doi.org/10.1016/j.patrec.2016.12.009

Journal Article Type	Article
Acceptance Date	Dec 14, 2016
Online Publication Date	Dec 15, 2016
Publication Date	Jul 1, 2017
Deposit Date	Dec 20, 2016
Publicly Available Date	Dec 16, 2017
Journal	Pattern recognition letters
Print ISSN	0167-8655
Electronic ISSN	1872-7344
Publisher	Elsevier
Peer Reviewed	Peer Reviewed
Volume	93
Pages	133-142
DOI	https://doi.org/10.1016/j.patrec.2016.12.009
Keywords	Emotion classification; Domain specific word emotion lexicons; Feature extraction
Public URL	http://hdl.handle.net/10059/2054
Contract Date	Dec 20, 2016