Anil Bandhakavi
Domain-specific lexicon generation for emotion detection from text.
Bandhakavi, Anil
Authors
Contributors
Professor Nirmalie Wiratunga n.wiratunga@rgu.ac.uk
Supervisor
Dr Stewart Massie s.massie@rgu.ac.uk
Supervisor
Deepak Padmanabhan
Supervisor
Abstract
Emotions play a key role in effective and successful human communication. Text is popularly used on the internet and social media websites to express and share emotions, feelings and sentiments. However, useful applications and services built to understand emotions from text are limited in effectiveness due to reliance on general-purpose emotion lexicons that have static vocabulary and sentiment lexicons that can only interpret emotions coarsely. Thus, emotion detection from text calls for methods and knowledge resources that can deal with challenges such as dynamic and informal vocabulary, domain-level variations in emotional expressions and other linguistic nuances. In this thesis we demonstrate how labelled (e.g. blogs, news headlines) and weakly-labelled (e.g. tweets) emotional documents can be harnessed to learn word-emotion lexicons that can account for dynamic and domain-specific emotional vocabulary. We model the characteristics of real-world emotional documents to propose a generative mixture model, which iteratively estimates the language models that best describe the emotional documents using expectation maximization (EM). The proposed mixture model has the ability to model both emotionally charged words and emotion-neutral words. We then generate a word-emotion lexicon using the mixture model to quantify word-emotion associations in the form of probability vectors. Secondly, we introduce novel feature extraction methods to utilize the emotion-rich knowledge captured by our word-emotion lexicon. The extracted features are used to classify text into emotion classes using machine learning. Further, we propose hybrid text representations for emotion classification that use the knowledge of lexicon-based features in conjunction with other representations such as n-grams, part-of-speech and sentiment information.
Thirdly, we propose two methods that jointly use an emotion-labelled corpus of tweets and an emotion-sentiment mapping proposed in psychology to learn word-level numerical quantification of sentiment strengths over a positive-to-negative spectrum. Finally, we evaluate all the methods proposed in this thesis through a variety of emotion detection and sentiment analysis tasks on benchmark data sets covering domains from blogs and news articles to tweets and incident reports.
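The abstract's idea of a mixture of an emotion language model and an emotion-neutral model, estimated with EM and turned into per-word probability vectors, can be illustrated with a minimal sketch. This is not the thesis's actual formulation: it assumes a unigram model, a fixed background distribution taken from the whole corpus, and a fixed mixing weight. The function name `learn_lexicon` and the parameters `lam` and `iterations` are hypothetical.

```python
from collections import Counter, defaultdict


def learn_lexicon(docs, lam=0.5, iterations=20):
    """Illustrative only. docs is a list of (tokens, emotion_label) pairs.

    Returns {word: {emotion: probability}}, one probability vector per word.
    """
    # Word counts per emotion class.
    counts = defaultdict(Counter)
    for tokens, emotion in docs:
        counts[emotion].update(tokens)

    # Assumed neutral (background) model: corpus-wide relative frequencies.
    total = Counter()
    for class_counts in counts.values():
        total.update(class_counts)
    n_total = sum(total.values())
    background = {w: c / n_total for w, c in total.items()}

    # Initialise each emotion language model from raw relative frequencies.
    theta = {}
    for emotion, class_counts in counts.items():
        n = sum(class_counts.values())
        theta[emotion] = {w: c / n for w, c in class_counts.items()}

    for _ in range(iterations):
        for emotion, class_counts in counts.items():
            # E-step: probability that an occurrence of w was generated by
            # the emotion model rather than the neutral background model.
            resp = {}
            for w in class_counts:
                emo = lam * theta[emotion][w]
                neu = (1.0 - lam) * background[w]
                resp[w] = emo / (emo + neu)
            # M-step: re-estimate the emotion model from weighted counts.
            weighted = {w: class_counts[w] * resp[w] for w in class_counts}
            norm = sum(weighted.values())
            theta[emotion] = {w: v / norm for w, v in weighted.items()}

    # Lexicon: normalise each word's scores across emotions.
    lexicon = {}
    for w in total:
        scores = {e: theta[e].get(w, 0.0) for e in counts}
        norm = sum(scores.values())
        lexicon[w] = {e: s / norm for e, s in scores.items()}
    return lexicon


if __name__ == "__main__":
    toy_docs = [
        ("i am so happy today".split(), "joy"),
        ("i love this happy day".split(), "joy"),
        ("i am so sad today".split(), "sadness"),
        ("i hate this sad day".split(), "sadness"),
    ]
    for word, vector in sorted(learn_lexicon(toy_docs).items()):
        print(word, vector)
```

In this sketch, words that occur evenly across classes (such as "i") are increasingly explained by the background model, so the emotion models concentrate on class-specific words. The toy corpus is invented for illustration and is not drawn from the thesis's data sets.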
Citation
BANDHAKAVI, A. 2018. Domain-specific lexicon generation for emotion detection from text. Robert Gordon University, PhD thesis.
Field | Value |
---|---|
Thesis Type | Thesis |
Deposit Date | Aug 31, 2018 |
Publicly Available Date | Aug 31, 2018 |
Keywords | Emotion; Lexicons; Emotion detection; Text; Word emotion lexicon; Emotion labelled corpus; Emotion sentiment mapping; Machine learning |
Public URL | http://hdl.handle.net/10059/3103 |
Contract Date | Aug 31, 2018 |
Award Date | Jan 31, 2018 |
Files
BANDHAKAVI 2018 Domain-specific lexicon
(4.1 Mb)
PDF
Publisher Licence URL
https://creativecommons.org/licenses/by-nc/4.0/
Copyright Statement
© The Author.