Dr Ikechukwu Nkisi-Orji i.nkisi-orji@rgu.ac.uk
Chancellor's Fellow
Taxonomic corpus-based concept summary generation for document annotation.
Nkisi-Orji, Ikechukwu; Wiratunga, Nirmalie; Hui, Kit-Ying; Heaven, Rachel; Massie, Stewart
Authors
Professor Nirmalie Wiratunga n.wiratunga@rgu.ac.uk
Associate Dean for Research
Dr Kit-ying Hui k.hui@rgu.ac.uk
Lecturer
Rachel Heaven
Dr Stewart Massie s.massie@rgu.ac.uk
Associate Professor
Contributors
Jaap Kamps
Editor
Giannis Tsakonas
Editor
Yannis Manolopoulos
Editor
Lazaros Iliadis
Editor
Ioannis Karydis
Editor
Abstract
Semantic annotation is an enabling technology which links documents to concepts that unambiguously describe their content. Annotation improves access to document contents for both humans and software agents. However, the annotation process is a challenging task as annotators often have to select from thousands of potentially relevant concepts from controlled vocabularies. The best approaches to assist in this task rely on reusing the annotations of an annotated corpus. In the absence of a pre-annotated corpus, alternative approaches suffer due to insufficient descriptive texts for concepts in most vocabularies. In this paper, we propose an unsupervised method for recommending document annotations based on generating node descriptors from an external corpus. We exploit knowledge of the taxonomic structure of a thesaurus to ensure that effective descriptors (concept summaries) are generated for concepts. Our evaluation on recommending annotations shows that the content that we generate effectively represents the concepts. Also, our approach outperforms those which rely on information from a thesaurus alone and is comparable with supervised approaches.
Citation
NKISI-ORJI, I., WIRATUNGA, N., HUI, K.-Y., HEAVEN, R. and MASSIE, S. 2017. Taxonomic corpus-based concept summary generation for document annotation. In Kamps, J., Tsakonas, G., Manolopoulos, Y., Iliadis, L. and Karydis, I. (eds.) Proceedings of the 21st International conference on theory and practice of digital libraries (TPDL 2017): research and advanced technology for digital libraries, 18-21 September 2017, Thessaloniki, Greece. Lecture notes in computer science, 10450. Cham: Springer [online], pages 49-60. Available from: https://doi.org/10.1007/978-3-319-67008-9_5
Presentation Conference Type | Conference Paper (published) |
---|---|
Conference Name | 21st International conference on theory and practice of digital libraries (TPDL 2017) |
Start Date | Sep 18, 2017 |
End Date | Sep 21, 2017 |
Acceptance Date | May 26, 2017 |
Online Publication Date | Sep 2, 2017 |
Publication Date | Sep 30, 2017 |
Deposit Date | Nov 10, 2017 |
Publicly Available Date | Sep 3, 2018 |
Print ISSN | 0302-9743 |
Publisher | Springer |
Peer Reviewed | Peer Reviewed |
Pages | 49-60 |
Series Title | Lecture notes in computer science |
Series Number | 10450 |
Series ISSN | 0302-9743 |
ISBN | 9783319670072 |
DOI | https://doi.org/10.1007/978-3-319-67008-9_5 |
Keywords | Taxonomy; Text annotation; Information discovery |
Public URL | http://hdl.handle.net/10059/2586 |
Contract Date | Nov 10, 2017 |
Files
NKISI-ORJI 2017 Taxonomic corpus-based concept summaries
(978 Kb)
PDF
Publisher Licence URL
https://creativecommons.org/licenses/by-nc/4.0/