Akinola Ogunsemi
Ensemble-based relationship discovery in relational databases.
Ogunsemi, Akinola; McCall, John; Kern, Mathias; Lacroix, Benjamin; Corsar, David; Owusu, Gilbert
Authors
Professor John McCall j.mccall@rgu.ac.uk
Professorial Lead
Mathias Kern
Benjamin Lacroix
Dr David Corsar d.corsar1@rgu.ac.uk
Senior Lecturer
Gilbert Owusu
Contributors
Max Bramer
Editor
Richard Ellis
Editor
Abstract
We investigate how several data relationship discovery algorithms can be combined to improve performance. We consider eight relationship discovery algorithms, including Cosine similarity, Soundex similarity, Name similarity and Value range similarity, which identify potential links between database tables in different ways using different categories of database information. We propose voting-system and hierarchical-clustering ensemble methods to reduce the generalization error of each algorithm. The voting scheme uses a given weighting metric to combine the predictions of each algorithm. Hierarchical clustering groups predictions into clusters based on their similarities and then combines one member from each cluster. We run experiments to validate the performance of each algorithm and compare it with our ensemble methods and the state-of-the-art algorithms (FaskFK, Randomness and HoPF), using Precision, Recall and F-Measure as evaluation metrics over the TPCH and AdvWork datasets. Results show that the performance of each individual algorithm is limited, indicating the importance of combining them to consolidate their strengths.
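The weighted voting idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `weighted_vote`, the acceptance threshold, the weights and the candidate links are all hypothetical, chosen only to show how per-algorithm predictions might be combined under a weighting metric. The column names follow TPC-H naming, but the predictions are made up and are not results from the paper.

```python
from collections import defaultdict


def weighted_vote(predictions, weights, threshold=0.5):
    """Combine per-algorithm candidate links by weighted voting.

    predictions: dict mapping algorithm name -> set of candidate links
    weights: dict mapping algorithm name -> non-negative weight (assumed given)
    Returns the links whose share of the total weight reaches `threshold`.
    """
    total = sum(weights[name] for name in predictions)
    if total == 0:
        return set()
    score = defaultdict(float)
    for name, links in predictions.items():
        for link in links:
            # Each algorithm adds its weight to every link it predicts.
            score[link] += weights[name]
    return {link for link, s in score.items() if s / total >= threshold}


# Hypothetical predictions from three of the similarity algorithms.
predictions = {
    "name_similarity": {
        ("orders.o_custkey", "customer.c_custkey"),
        ("orders.o_orderkey", "customer.c_custkey"),
    },
    "cosine_similarity": {
        ("orders.o_custkey", "customer.c_custkey"),
    },
    "value_range_similarity": {
        ("orders.o_custkey", "customer.c_custkey"),
        ("lineitem.l_orderkey", "orders.o_orderkey"),
    },
}
# Hypothetical weights; the weighting metric itself is an assumption here.
weights = {"name_similarity": 3, "cosine_similarity": 2, "value_range_similarity": 2}

accepted = weighted_vote(predictions, weights)
print(accepted)
```

In this sketch only the link proposed by all three algorithms passes the threshold, while links backed by a single algorithm are dropped. Changing the weights or the threshold changes which candidates survive.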
Citation
OGUNSEMI, A., MCCALL, J., KERN, M., LACROIX, B., CORSAR, D. and OWUSU, G. 2020. Ensemble-based relationship discovery in relational databases. In Bramer, M. and Ellis, R. (eds.) Artificial intelligence XXXVII: proceedings of 40th British Computer Society's Specialist Group on Artificial Intelligence (SGAI) Artificial intelligence international conference 2020 (AI-2020), 15-17 December 2020, [virtual conference]. Lecture notes in artificial intelligence, 12498. Cham: Springer [online], pages 286-300. Available from: https://doi.org/10.1007/978-3-030-63799-6_22
Presentation Conference Type | Conference Paper (published) |
---|---|
Conference Name | 40th British Computer Society's Specialist Group on Artificial Intelligence (SGAI) Artificial intelligence international conference 2020 (AI-2020) |
Start Date | Dec 15, 2020 |
End Date | Dec 17, 2020 |
Acceptance Date | Sep 3, 2020 |
Online Publication Date | Dec 8, 2020 |
Publication Date | Dec 31, 2020 |
Deposit Date | Jan 8, 2021 |
Publicly Available Date | Jan 8, 2021 |
Publisher | Springer |
Peer Reviewed | Peer Reviewed |
Volume | 12498 |
Pages | 286-300 |
Series Title | Lecture notes in artificial intelligence |
Series ISSN | 0302-9743 |
Book Title | Artificial intelligence XXXVII: proceedings of 40th SGAI Artificial intelligence international conference (AI 2020), 15-17 December 2020, Cambridge, UK |
ISBN | 9783030637989 |
DOI | https://doi.org/10.1007/978-3-030-63799-6_22 |
Keywords | Data discovery; Database management; Ensemble-based discovery; Primary/foreign key relationship; Semantic relationship |
Public URL | https://rgu-repository.worktribe.com/output/1085256 |
Files
OGUNSEMI 2020 Ensemble based (AAM)
(474 Kb)
PDF