Representation and learning schemes for argument stance mining.
Authors
Jérémie Clos
Contributors
Professor Nirmalie Wiratunga n.wiratunga@rgu.ac.uk
Supervisor
Dr Stewart Massie s.massie@rgu.ac.uk
Supervisor
Joemon Jose
Supervisor
Abstract
Argumentation is a key part of human interaction. Used introspectively, it searches for the truth by laying down arguments for and against positions. As a mediation tool, it can be used to search for compromise between multiple human agents. For this purpose, theories of argumentation have been in development since the Ancient Greeks in order to formalise the process and therefore remove human imprecision from it. From this practice the process of argument mining has emerged. As human interaction has moved from the small scale of one-to-one (or few-to-few) debates to large-scale discussions where tens of thousands of participants can express their opinion in real time, the importance of argument mining has grown, while its feasibility in a manual annotation setting has diminished and it has relied mainly on human-defined heuristics to process the data. This underlines the importance of a new generation of computational tools that can automate this process on a larger scale. In this thesis we study argument stance detection, one of the steps involved in the argument mining workflow. We demonstrate how we can use data of varying reliability in order to mine argument stance in social media data. We investigate a spectrum of techniques: completely unsupervised classification of stance using a sentiment lexicon, automated computation of a regularised stance lexicon, automated computation of a lexicon with modifiers, and the use of a lexicon with modifiers as a temporal feature model for more complex classification algorithms. We find that the addition of contextual information enhances unsupervised stance classification, within reason, and that multi-strategy algorithms that combine multiple heuristics by ordering them from the precise to the general tend to outperform other approaches by a large margin.
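As a purely illustrative aid, the idea of combining heuristics ordered from the precise to the general can be sketched as a fallback chain: each rule either returns a stance label or abstains, and the first rule that fires decides. This is a minimal sketch under stated assumptions, not the thesis's implementation; the toy lexicon, the negator list, the rule names and the labels are all hypothetical.

```python
from typing import Callable, List, Optional

# Hypothetical toy lexicon: positive weights lean "for", negative lean "against".
TOY_LEXICON = {"support": 1.0, "agree": 1.0, "good": 0.5,
               "oppose": -1.0, "disagree": -1.0, "bad": -0.5}
# Hypothetical modifier list: a negator flips the weight of the next word.
NEGATORS = {"not", "never", "no"}

Heuristic = Callable[[List[str]], Optional[str]]


def explicit_phrase(tokens: List[str]) -> Optional[str]:
    """Most precise rule: an explicit first-person statement of stance."""
    for first, second in zip(tokens, tokens[1:]):
        if first == "i" and second == "agree":
            return "for"
        if first == "i" and second == "disagree":
            return "against"
    return None  # abstain


def lexicon_with_negation(tokens: List[str]) -> Optional[str]:
    """More general rule: sum lexicon weights, flipping a word after a negator."""
    score = 0.0
    for i, tok in enumerate(tokens):
        weight = TOY_LEXICON.get(tok, 0.0)
        if i > 0 and tokens[i - 1] in NEGATORS:
            weight = -weight
        score += weight
    if score > 0:
        return "for"
    if score < 0:
        return "against"
    return None  # abstain when the evidence is balanced or absent


def classify(text: str, heuristics: List[Heuristic], default: str = "neutral") -> str:
    """Apply heuristics in order; the first one that does not abstain wins."""
    tokens = text.lower().split()
    for heuristic in heuristics:
        label = heuristic(tokens)
        if label is not None:
            return label
    return default


ORDERED = [explicit_phrase, lexicon_with_negation]
```

Calling `classify("I disagree but this is good", ORDERED)` lets the explicit-phrase rule decide before the lexicon is consulted, whereas `classify("this is not good", ORDERED)` falls through to the lexicon rule with the negator applied.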
Focusing then on building a stance lexicon, we find that optimising such lexicons using an empirical risk minimisation framework allows us to regularise them to a higher degree than competing probabilistic techniques, which helps us learn better lexicons from noisy data. We also conclude that adding local context information (neighbouring words) during the learning phase of the lexicons tends to produce more accurate results at the cost of robustness, since part of the weight is redistributed from the words with a class valence to the contextual words. Finally, when investigating the use of lexicons to build feature models for traditional machine learning techniques, simple lexicons (without context) seem to perform overall as well as more complex ones, and better than purely semantic representations. We also find that word-level feature models tend to outperform sentence-level and instance-level representations, but that they do not benefit as much from being augmented by lexicon knowledge.
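To make the notion of learning a regularised lexicon by empirical risk minimisation more concrete, the following minimal sketch fits one weight per word by gradient descent on an average logistic loss with an L2 penalty. The choice of loss, the penalty, the hyperparameters and the function name `learn_lexicon` are assumptions made for illustration only; the abstract does not state the exact objective used in the thesis.

```python
import math
from collections import Counter
from typing import Dict, List


def learn_lexicon(docs: List[List[str]], labels: List[int], l2: float = 0.1,
                  lr: float = 0.5, epochs: int = 200) -> Dict[str, float]:
    """Learn word weights by minimising
        (1/n) * sum_i log(1 + exp(-y_i * w.x_i)) + (l2/2) * ||w||^2
    with plain gradient descent. Labels are +1 ("for") or -1 ("against").
    Illustrative only: the loss and regulariser are assumed, not the thesis's.
    """
    bags = [Counter(doc) for doc in docs]  # bag-of-words counts per document
    n = len(docs)
    weights: Dict[str, float] = {}
    for _ in range(epochs):
        # Gradient of the L2 penalty term.
        grad = {word: l2 * value for word, value in weights.items()}
        for bag, y in zip(bags, labels):
            margin = y * sum(weights.get(w, 0.0) * c for w, c in bag.items())
            # Derivative of log(1 + exp(-margin)) w.r.t. w is -y * x * sigmoid(-margin).
            coeff = -y / (1.0 + math.exp(margin)) / n
            for word, count in bag.items():
                grad[word] = grad.get(word, 0.0) + coeff * count
        for word, g in grad.items():
            weights[word] = weights.get(word, 0.0) - lr * g
    return weights


# Hypothetical toy corpus: two "for" posts and two "against" posts.
TOY_DOCS = [["i", "support", "this"], ["we", "support", "it"],
            ["i", "oppose", "this"], ["we", "oppose", "it"]]
TOY_LABELS = [1, 1, -1, -1]

weak = learn_lexicon(TOY_DOCS, TOY_LABELS, l2=0.1)
strong = learn_lexicon(TOY_DOCS, TOY_LABELS, l2=1.0)
```

In this toy setting the stance-bearing words receive weights of opposite sign, and raising `l2` shrinks their magnitude, which is the sense in which a stronger penalty limits how much the lexicon can fit to noisy labels.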
This research programme was carried out in collaboration with the University of Glasgow, Department of Computer Science.
Citation
CLOS, J. 2019. Representation and learning schemes for argument stance mining. Robert Gordon University [online], PhD thesis. Available from: https://openair.rgu.ac.uk
Thesis Type | Thesis |
---|---|
Deposit Date | Oct 10, 2019 |
Publicly Available Date | Oct 10, 2019 |
Keywords | Sentiment analysis; Argument stance mining; Natural language processing; Social media; Lexicons |
Public URL | https://rgu-repository.worktribe.com/output/638077 |
Award Date | Jun 30, 2019 |
Files
CLOS 2019 Representation and learning
(2.6 Mb)
PDF
Publisher Licence URL
https://creativecommons.org/licenses/by-nc/4.0/
Copyright Statement
© The Author.