SCaLe-QA: Sri Lankan case law embeddings for legal QA.

Jayawardena, Lasal; Wiratunga, Nirmalie; Abeyratne, Ramitha; Martin, Kyle; Nkisi-Orji, Ikechukwu; Weerasinghe, Ruvan

Abstract

SCaLe-QA is a foundational system developed for Sri Lankan Legal Question Answering (LQA) by leveraging domain-specific embeddings derived from Supreme Court cases. The system is tailored to capture the unique linguistic and structural characteristics of Sri Lankan law through fine-tuned embeddings. While Case-Based Reasoning (CBR) will be integrated into the question-answering framework, it is primarily set for future development and evaluation. Currently, SCaLe-QA employs semantic chunking, tokenization, and BM25-based ranking to generate context-driven triplets from unlabeled corpora. In addition, an angle-optimised contrastive learning framework is applied to enhance retrieval accuracy. Preliminary results indicate promise, establishing SCaLe-QA as a significant step toward robust AI applications in the Sri Lankan legal domain.
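The abstract mentions BM25-based ranking as a way to generate triplets from unlabeled text. The following is a minimal sketch of that general idea, not the authors' implementation: it assumes the corpus has already been split into chunks, uses simple whitespace tokenization, and takes the highest-scoring and lowest-scoring other chunk as positive and negative. The function names (`tokenize`, `bm25_scores`, `mine_triplets`), the parameter defaults, and the positive/negative selection rule are all illustrative assumptions.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization stands in for the
    # paper's tokenization step, which the abstract does not specify.
    return text.lower().split()


def bm25_scores(query_tokens, docs_tokens, k1=1.5, b=0.75):
    """Score each tokenized document against the query with Okapi BM25."""
    n_docs = len(docs_tokens)
    avg_len = sum(len(d) for d in docs_tokens) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter()
    for d in docs_tokens:
        df.update(set(d))
    scores = []
    for d in docs_tokens:
        tf = Counter(d)
        score = 0.0
        for term in set(query_tokens):
            if term not in tf:
                continue
            # Smoothed IDF variant that is always non-negative.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def mine_triplets(chunks):
    """Build (anchor, positive, negative) triplets from unlabeled chunks.

    Hypothetical rule: for each anchor chunk, the best BM25 match among the
    other chunks is the positive and the worst match is the negative.
    Anchors whose candidates all score the same are skipped.
    """
    tokens = [tokenize(c) for c in chunks]
    triplets = []
    for i, anchor in enumerate(chunks):
        others = [j for j in range(len(chunks)) if j != i]
        if len(others) < 2:
            continue
        scores = bm25_scores(tokens[i], [tokens[j] for j in others])
        ranked = sorted(zip(scores, others), key=lambda p: p[0], reverse=True)
        best_score, pos = ranked[0]
        worst_score, neg = ranked[-1]
        if best_score > worst_score:
            triplets.append((anchor, chunks[pos], chunks[neg]))
    return triplets
```

Triplets of this shape are the usual input to a contrastive fine-tuning objective for an embedding model; the angle-optimised objective named in the abstract is not reproduced here.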

Citation

JAYAWARDENA, L., WIRATUNGA, N., ABEYRATNE, R., MARTIN, K., NKISI-ORJI, I. and WEERASINGHE, R. 2024. SCaLe-QA: Sri Lankan case law embeddings for legal QA. In Martin, K., Salimi, P. and Wijayasekara, V. (eds.) 2024. SICSA REALLM workshop 2024: proceedings of the SICSA (Scottish Informatics and Computer Science Alliance) REALLM (Reasoning, explanation and applications of large language models) workshop (SICSA REALLM workshop 2024), 17 October 2024, Aberdeen, UK. CEUR workshop proceedings, 3822. Aachen: CEUR-WS [online], pages 47-55. Available from: https://ceur-ws.org/Vol-3822/short6.pdf

Presentation Conference Type: Conference Paper (published)
Conference Name: 2024 SICSA (Scottish Informatics and Computer Science Alliance) REALLM (Reasoning, explanation and applications of large language models) workshop (SICSA REALLM workshop 2024)
Start Date: Oct 17, 2024
Acceptance Date: Oct 1, 2024
Online Publication Date: Oct 17, 2024
Publication Date: Nov 4, 2024
Deposit Date: Dec 5, 2024
Publicly Available Date: Dec 5, 2024
Publisher: CEUR-WS
Peer Reviewed: Peer Reviewed
Pages: 47-55
Series Title: CEUR workshop proceedings
Series Number: 3822
Series ISSN: 1613-0073
Keywords: Text embeddings; Legal AI; RAG; CBR; Legal question answering; Retrieval
Public URL: https://rgu-repository.worktribe.com/output/2613537
Publisher URL: https://ceur-ws.org/Vol-3822/
