Representation and learning schemes for argument stance mining.
Professor Nirmalie Wiratunga email@example.com
Doctor Stewart Massie firstname.lastname@example.org
Argumentation is a key part of human interaction. Used introspectively, it searches for the truth by laying down arguments for and against positions. As a mediation tool, it can be used to search for compromise between multiple human agents. For this purpose, theories of argumentation have been in development since the Ancient Greeks, in order to formalise the process and thereby remove human imprecision from it. From this practice the process of argument mining has emerged. As human interaction has moved from the small scale of one-to-one (or few-to-few) debates to large-scale discussions where tens of thousands of participants can express their opinions in real time, the importance of argument mining has grown, while its feasibility in a manual annotation setting has diminished; it has relied mainly on human-defined heuristics to process the data. This underlines the importance of a new generation of computational tools that can automate this process on a larger scale.
In this thesis we study argument stance detection, one of the steps involved in the argument mining workflow. We demonstrate how data of varying reliability can be used to mine argument stance in social media data. We investigate a spectrum of techniques: completely unsupervised classification of stance using a sentiment lexicon, automated computation of a regularised stance lexicon, automated computation of a lexicon with modifiers, and the use of a lexicon with modifiers as a temporal feature model for more complex classification algorithms. We find that the addition of contextual information enhances unsupervised stance classification, within reason, and that multi-strategy algorithms that combine multiple heuristics by ordering them from the precise to the general tend to outperform other approaches by a large margin.
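To make the idea of ordering heuristics from the precise to the general concrete, the following is a minimal sketch and not the method used in the thesis. The toy lexicon, the negator list, the two rules and the names `classify_stance`, `explicit_phrase_rule` and `lexicon_rule` are all assumptions introduced for illustration.

```python
from typing import Callable, List, Optional

# Hypothetical toy lexicon: positive weights lean "for", negative lean "against".
LEXICON = {"support": 1.0, "great": 0.5, "oppose": -1.0, "bad": -0.5}
# Hypothetical modifiers that flip the polarity of the following word.
NEGATORS = {"not", "never", "no"}

Rule = Callable[[List[str]], Optional[str]]


def explicit_phrase_rule(tokens: List[str]) -> Optional[str]:
    """Most precise heuristic: an explicit statement of (dis)agreement."""
    bigrams = set(zip(tokens, tokens[1:]))
    if ("i", "agree") in bigrams:
        return "for"
    if ("i", "disagree") in bigrams:
        return "against"
    return None


def lexicon_rule(tokens: List[str]) -> Optional[str]:
    """More general heuristic: sum lexicon weights, flipping after a negator."""
    score = 0.0
    for i, token in enumerate(tokens):
        weight = LEXICON.get(token, 0.0)
        if i > 0 and tokens[i - 1] in NEGATORS:
            weight = -weight
        score += weight
    if score > 0:
        return "for"
    if score < 0:
        return "against"
    return None


# Rules are tried in order, from the most precise to the most general.
RULES: List[Rule] = [explicit_phrase_rule, lexicon_rule]


def classify_stance(text: str, fallback: str = "unknown") -> str:
    """Return the label of the first rule that fires, else the fallback."""
    tokens = text.lower().split()
    for rule in RULES:
        label = rule(tokens)
        if label is not None:
            return label
    return fallback
```

In this sketch a later, more general rule is consulted only when every earlier rule abstains, so a precise cue such as "I disagree" overrides a positive lexicon score elsewhere in the same text.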
Focusing then on building a stance lexicon, we find that optimising such lexicons using an empirical risk minimisation framework allows us to regularise them to a higher degree than competing probabilistic techniques, which helps us learn better lexicons from noisy data. We also conclude that adding local context information (neighbouring words) during the learning phase of the lexicons tends to produce more accurate results at the cost of robustness, since part of the weight is redistributed from the words with a class valence to the contextual words. Finally, when investigating the use of lexicons to build feature models for traditional machine learning techniques, simple lexicons (without context) seem to perform as well overall as more complex ones, and better than purely semantic representations. We also find that word-level feature models tend to outperform sentence-level and instance-level representations, but that they do not benefit as much from being augmented by lexicon knowledge.
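As an illustration of learning a lexicon by regularised empirical risk minimisation, the sketch below fits one weight per word by gradient descent. The abstract does not name the loss or the regulariser, so the logistic loss, the L2 penalty, the function name `learn_lexicon` and all hyperparameter values are assumptions made for this example only.

```python
import math
from collections import defaultdict
from typing import Dict, List


def learn_lexicon(
    docs: List[List[str]],
    labels: List[int],
    lam: float = 0.1,   # assumed L2 regularisation strength
    lr: float = 0.5,    # assumed learning rate
    epochs: int = 200,  # assumed number of full passes over the data
) -> Dict[str, float]:
    """Learn one weight per word; labels are +1 ("for") or -1 ("against")."""
    weights: Dict[str, float] = defaultdict(float)
    n = len(docs)
    for _ in range(epochs):
        grad: Dict[str, float] = defaultdict(float)
        for tokens, y in zip(docs, labels):
            # A document's score is the sum of the weights of its words.
            margin = y * sum(weights[t] for t in tokens)
            # Derivative of log(1 + exp(-margin)) with respect to the score.
            coeff = -y / (1.0 + math.exp(margin))
            for t in tokens:
                grad[t] += coeff / n
        for t in grad:
            # Gradient step on the average loss plus the L2 penalty term.
            weights[t] -= lr * (grad[t] + lam * weights[t])
    return dict(weights)


docs = [
    ["support", "the", "plan"],
    ["oppose", "the", "plan"],
    ["support", "it"],
    ["oppose", "it"],
]
labels = [1, -1, 1, -1]
lexicon = learn_lexicon(docs, labels)
```

On this toy data the words that appear equally often in both classes ("the", "plan", "it") receive no net gradient, while "support" and "oppose" acquire weights of opposite sign; a larger `lam` shrinks all weights towards zero, which is the regularisation effect referred to above.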
This research programme was carried out in collaboration with the University of Glasgow, Department of Computer Science.
| Institution Citation | CLOS, J. 2019. Representation and learning schemes for argument stance mining. Robert Gordon University [online], PhD thesis. Available from: https://openair.rgu.ac.uk |
| Keywords | Sentiment analysis; Argument stance mining; Natural language processing; Social media; Lexicons |
Copyright: the author and Robert Gordon University