Ontology-driven information retrieval deals with the use of entities specified in domain ontologies to enhance search and browse. The entities or concepts of lightweight ontological resources are traditionally used to index resources in specialised domains. Indexing with concepts is often achieved manually and reusing them to enhance search remains a challenge. Other challenges range from the difficulty in merging multiple ontologies for use in retrieval to the problem of integrating concept-based search into existing search systems. We mainly encounter these challenges in enterprise search environments, which have not kept pace with Web search engines and mostly rely on full-text search systems. Full-text search systems are keyword-based and suffer from well-known vocabulary mismatch problems. Ontologies model domain knowledge and have the potential for use in understanding the unstructured content of documents. In this thesis, we investigate the challenges of using domain ontologies for enhancing search in enterprise systems. Firstly, we investigate methods for annotating documents by identifying the best concepts that represent their contents. We explore ways to overcome the challenges of insufficient textual features in lightweight ontologies and introduce an unsupervised method for annotating documents based on generating concept descriptors from external resources. Specifically, we augment concepts with descriptive textual content by exploiting the taxonomic structure of an ontology to ensure that we generate useful descriptors. Secondly, the need often arises for cross-ontology reasoning when using multiple ontologies in ontology-driven search. Once again, we attempt to overcome the absence of rich features in lightweight ontologies by exploring the use of background knowledge for the alignment process. We propose novel ontology alignment techniques which integrate string metrics, semantic features, and term weights for discovering diverse correspondence types in supervised and unsupervised ontology alignment. Thirdly, we investigate different representational schemes for queries and documents and explore semantic ranking models using conceptual representations. Accordingly, we propose a semantic ranking model that incorporates the knowledge of concept relatedness and a predictive model to apply semantic ranking only when it is deemed beneficial for retrieval. Finally, we conduct comprehensive evaluations of the proposed methods and discuss our findings.
NKISI-ORJI, I. 2019. Ontology driven information retrieval. Robert Gordon University [online], PhD thesis. Available from: https://openair.rgu.ac.uk