Dr Ikechukwu Nkisi-Orji i.nkisi-orji@rgu.ac.uk
Research Fellow (B)
Professor Nirmalie Wiratunga n.wiratunga@rgu.ac.uk
Professor
Dr Stewart Massie s.massie@rgu.ac.uk
Reader
Dr Kit-ying Hui k.hui@rgu.ac.uk
Lecturer
Rachel Heaven
Michele Berlingerio
Editor
Francesco Bonchi
Editor
Thomas Gärtner
Editor
Neil Hurley
Editor
Georgiana Ifrim
Editor
Ontology alignment is crucial for integrating heterogeneous data sources and is an important component in realising the goals of the semantic web. Accordingly, several ontology alignment techniques have been proposed and used to discover correspondences between the concepts (or entities) of different ontologies. However, these techniques mostly depend on string-based similarities, which cannot handle the vocabulary mismatch problem. Determining which similarity measures to use, and how to combine them effectively in alignment systems, also remains a persistent challenge in this area. In this work, we introduce a random forest classifier approach for ontology alignment which relies on word embedding to discover semantic similarities between concepts. Specifically, we combine string-based and semantic similarity measures to form feature vectors that the classifier model uses to determine when concepts match. By harnessing background knowledge and relying on minimal information from the ontologies, our approach can deal with knowledge-light ontological resources. It also eliminates the need to learn aggregation weights for multiple similarity measures. Our experiments using an Ontology Alignment Evaluation Initiative (OAEI) dataset and real-world ontologies highlight the utility of our approach and show that it can outperform state-of-the-art alignment systems.
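As a rough illustration of the feature-vector idea described in the abstract, the following minimal Python sketch pairs one string-based similarity with one embedding-based similarity for a pair of concept labels. It is not the authors' implementation: the toy three-dimensional embeddings, the choice of `difflib.SequenceMatcher` as the string measure, and the function names are all assumptions made for illustration. In a fuller pipeline, vectors like these would presumably be passed to a random forest classifier (for example scikit-learn's `RandomForestClassifier`) trained on labelled matching and non-matching pairs.

```python
import math
from difflib import SequenceMatcher

# Hypothetical toy "embeddings"; real word embeddings would have
# hundreds of dimensions and come from a pre-trained model.
TOY_EMBEDDINGS = {
    "author": [0.9, 0.1, 0.0],
    "writer": [0.8, 0.2, 0.1],
    "paper": [0.1, 0.9, 0.2],
    "article": [0.2, 0.8, 0.3],
}


def string_similarity(a, b):
    """String-based similarity in [0, 1] (an assumed stand-in measure)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 for a zero vector."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def semantic_similarity(a, b, embeddings):
    """Embedding-based similarity; 0.0 when either label has no embedding."""
    u = embeddings.get(a.lower())
    v = embeddings.get(b.lower())
    if u is None or v is None:
        return 0.0
    return cosine(u, v)


def feature_vector(a, b, embeddings=TOY_EMBEDDINGS):
    """Combine both measures into one feature vector for a concept pair."""
    return [string_similarity(a, b), semantic_similarity(a, b, embeddings)]


# Candidate concept pairs from two hypothetical ontologies.
pairs = [("author", "writer"), ("author", "paper"), ("paper", "article")]
features = [feature_vector(a, b) for a, b in pairs]
```

Because the classifier would learn directly from the combined vector, no hand-tuned weighting of the individual similarity measures is needed in this sketch, which mirrors the motivation stated in the abstract.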
NKISI-ORJI, I., WIRATUNGA, N., MASSIE, S., HUI, K.-Y. and HEAVEN, R. 2019. Ontology alignment based on word embedding and random forest classification. In Berlingerio, M., Bonchi, F., Gärtner, T., Hurley, N. and Ifrim, G. (eds.) Machine learning and knowledge discovery in databases: proceedings of the 2018 European conference on machine learning and principles and practice of knowledge discovery in databases (ECML PKDD 2018), 10-14 September 2018, Dublin, Ireland. Lecture notes in computer science, 11051. Cham: Springer [online], part I, pages 557-572. Available from: https://doi.org/10.1007/978-3-030-10925-7_34
Conference Name | 2018 European conference on machine learning and principles and practice of knowledge discovery in databases (ECML PKDD 2018) |
---|---|
Conference Location | Dublin, Ireland |
Start Date | Sep 10, 2018 |
End Date | Sep 14, 2018 |
Acceptance Date | Jun 14, 2018 |
Online Publication Date | Jan 18, 2019 |
Publication Date | Feb 15, 2019 |
Deposit Date | Jun 26, 2018 |
Publicly Available Date | Jan 18, 2019 |
Print ISSN | 0302-9743 |
Electronic ISSN | 1611-3349 |
Publisher | Springer |
Pages | 557-572 |
Series Title | Lecture notes in computer science |
Series Number | 11051 |
Series ISSN | 1611-3349 |
Book Title | Machine learning and knowledge discovery in databases |
ISBN | 9783030109240 |
DOI | https://doi.org/10.1007/978-3-030-10925-7_34 |
Keywords | Ontology alignment; Word embedding; Machine classification; Semantic web |
Public URL | http://hdl.handle.net/10059/2968 |
Publisher Licence URL
https://creativecommons.org/licenses/by-nc/4.0/