Ontology alignment based on word embedding and random forest classification.

Nkisi-Orji, Ikechukwu; Wiratunga, Nirmalie; Massie, Stewart; Hui, Kit-Ying; Heaven, Rachel

doi:10.1007/978-3-030-10925-7_34

Authors

Dr Ikechukwu Nkisi-Orji i.nkisi-orji@rgu.ac.uk
Chancellor's Fellow

Professor Nirmalie Wiratunga n.wiratunga@rgu.ac.uk
Associate Dean for Research

Dr Stewart Massie s.massie@rgu.ac.uk
Associate Professor

Dr Kit-ying Hui k.hui@rgu.ac.uk
Lecturer

Rachel Heaven

Contributors

Michele Berlingerio
Editor

Francesco Bonchi
Editor

Thomas Gärtner
Editor

Neil Hurley
Editor

Georgiana Ifrim
Editor

Abstract

Ontology alignment is crucial for integrating heterogeneous data sources and forms an important component for realising the goals of the semantic web. Accordingly, several ontology alignment techniques have been proposed and used for discovering correspondences between the concepts (or entities) of different ontologies. However, these techniques mostly depend on string-based similarities, which are unable to handle the vocabulary mismatch problem. Also, determining which similarity measures to use and how to effectively combine them in alignment systems are challenges that have persisted in this area. In this work, we introduce a random forest classifier approach for ontology alignment which relies on word embedding to discover semantic similarities between concepts. Specifically, we combine string-based and semantic similarity measures to form feature vectors that are used by the classifier model to determine when concepts match. By harnessing background knowledge and relying on minimal information from the ontologies, our approach can deal with knowledge-light ontological resources. It also eliminates the need for learning the aggregation weights of multiple similarity measures. Our experiments using the Ontology Alignment Evaluation Initiative (OAEI) dataset and real-world ontologies highlight the utility of our approach and show that it can outperform state-of-the-art alignment systems.
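
As a rough illustration of the feature-vector idea described in the abstract, the following Python sketch pairs two string-based similarity scores with one embedding-based score for a pair of concept labels. It is not the authors' implementation: the toy embedding table, the choice of similarity measures, and all function names (`feature_vector`, `semantic_similarity`, and so on) are assumptions made for the example, and a real system would load pre-trained word vectors rather than a hand-written dictionary.

```python
import math
from difflib import SequenceMatcher

# Toy 3-dimensional embeddings, invented purely for illustration.
# A real system would load pre-trained word vectors instead.
TOY_EMBEDDINGS = {
    "author": [0.9, 0.1, 0.0],
    "writer": [0.8, 0.2, 0.1],
    "paper": [0.1, 0.9, 0.2],
    "article": [0.2, 0.8, 0.3],
    "conference": [0.0, 0.2, 0.9],
}


def tokenize(label):
    """Lower-case a concept label and split it into word tokens."""
    return label.lower().replace("_", " ").split()


def label_vector(label):
    """Average the embeddings of known tokens; None if no token is known."""
    vectors = [TOY_EMBEDDINGS[t] for t in tokenize(label) if t in TOY_EMBEDDINGS]
    if not vectors:
        return None
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def string_similarity(a, b):
    """Character-level similarity ratio (string-based measure)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def token_jaccard(a, b):
    """Jaccard overlap of the token sets (string-based measure)."""
    ta, tb = set(tokenize(a)), set(tokenize(b))
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def semantic_similarity(a, b):
    """Cosine similarity of averaged word embeddings (semantic measure)."""
    va, vb = label_vector(a), label_vector(b)
    if va is None or vb is None:
        return 0.0
    return cosine(va, vb)


def feature_vector(a, b):
    """Combine string-based and semantic scores into one feature vector."""
    return [string_similarity(a, b), token_jaccard(a, b), semantic_similarity(a, b)]


if __name__ == "__main__":
    # "Author" and "Writer" share few characters but have close toy embeddings.
    print(feature_vector("Author", "Writer"))
```

Vectors of this kind, labelled as match or non-match, could then be passed to a classifier (for example scikit-learn's `RandomForestClassifier`), so that the classifier rather than hand-tuned aggregation weights decides how the individual measures are combined.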

Citation

NKISI-ORJI, I., WIRATUNGA, N., MASSIE, S., HUI, K.-Y. and HEAVEN, R. 2019. Ontology alignment based on word embedding and random forest classification. In Berlingerio, M., Bonchi, F., Gärtner, T., Hurley, N. and Ifrim, G. (eds.) Machine learning and knowledge discovery in databases: proceedings of the 2018 European conference on machine learning and principles and practice of knowledge discovery in databases (ECML PKDD 2018), 10-14 September 2018, Dublin, Ireland. Lecture notes in computer science, 11051. Cham: Springer [online], part I, pages 557-572. Available from: https://doi.org/10.1007/978-3-030-10925-7_34

Presentation Conference Type	Conference Paper (published)
Conference Name	2018 European conference on machine learning and principles and practice of knowledge discovery in databases (ECML PKDD 2018)
Start Date	Sep 10, 2018
End Date	Sep 14, 2018
Acceptance Date	Jun 14, 2018
Online Publication Date	Jan 18, 2019
Publication Date	Feb 15, 2019
Deposit Date	Jun 26, 2018
Publicly Available Date	Jan 19, 2020
Print ISSN	0302-9743
Electronic ISSN	1611-3349
Publisher	Springer
Peer Reviewed	Peer Reviewed
Pages	557-572
Series Title	Lecture notes in computer science
Series Number	11051
Series ISSN	1611-3349
Book Title	Machine learning and knowledge discovery in databases
ISBN	9783030109240
DOI	https://doi.org/10.1007/978-3-030-10925-7_34
Keywords	Ontology alignment; Word embedding; Machine classification; Semantic web
Public URL	http://hdl.handle.net/10059/2968
Contract Date	Jun 26, 2018