Lasal Jayawardena
SCaLe-QA: Sri Lankan case law embeddings for legal QA.
Jayawardena, Lasal; Wiratunga, Nirmalie; Abeyratne, Ramitha; Martin, Kyle; Nkisi-Orji, Ikechukwu; Weerasinghe, Ruvan
Authors
Professor Nirmalie Wiratunga n.wiratunga@rgu.ac.uk
Associate Dean for Research
Ramitha Abeyratne r.abeyratne@rgu.ac.uk
Research Student
Dr Kyle Martin k.martin3@rgu.ac.uk
Lecturer
Dr Ikechukwu Nkisi-Orji i.nkisi-orji@rgu.ac.uk
Chancellor's Fellow
Ruvan Weerasinghe
Contributors
Dr Kyle Martin k.martin3@rgu.ac.uk
Editor
Pedram Salimi p.salimi@rgu.ac.uk
Editor
Mr Vihanga Wijayasekara v.wijayasekara@rgu.ac.uk
Editor
Abstract
SCaLe-QA is a foundational system developed for Sri Lankan Legal Question Answering (LQA) by leveraging domain-specific embeddings derived from Supreme Court cases. The system is tailored to capture the unique linguistic and structural characteristics of Sri Lankan law through fine-tuned embeddings. While Case-Based Reasoning (CBR) will be integrated into the question-answering framework, it is primarily set for future development and evaluation. Currently, SCaLe-QA employs semantic chunking, tokenization, and BM25-based ranking to generate context-driven triplets from unlabeled corpora. In addition, an angle-optimised contrastive learning framework is applied to enhance retrieval accuracy. Preliminary results indicate promise, establishing SCaLe-QA as a significant step toward robust AI applications in the Sri Lankan legal domain.
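The triplet-generation step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the BM25 variant, the parameter values `k1` and `b`, the helper names `tokenize`, `bm25_scores` and `mine_triplets`, the rule of taking the top-ranked chunk as the positive and the bottom-ranked chunk as the negative, and the toy sentences are all assumptions made for illustration. Semantic chunking is assumed to have already produced the list of chunks.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphanumeric tokens (a deliberately simple tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_scores(query_tokens, docs_tokens, k1=1.5, b=0.75):
    """Score every tokenized document against the query with Okapi BM25."""
    n_docs = len(docs_tokens)
    avg_len = (sum(len(d) for d in docs_tokens) / n_docs) or 1.0
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for d in docs_tokens for term in set(d))
    scores = []
    for doc in docs_tokens:
        tf = Counter(doc)
        score = 0.0
        for term in set(query_tokens):
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def mine_triplets(chunks):
    """Build (anchor, positive, negative) triplets from unlabeled chunks.

    Assumption: for each anchor, the best BM25 match among the other chunks
    is used as the positive and the worst match as the negative.
    """
    if len(chunks) < 3:
        raise ValueError("need at least three chunks to form a triplet")
    tokens = [tokenize(c) for c in chunks]
    triplets = []
    for i, anchor in enumerate(chunks):
        others = [j for j in range(len(chunks)) if j != i]
        scores = bm25_scores(tokens[i], [tokens[j] for j in others])
        ranked = sorted(zip(scores, others), key=lambda pair: pair[0], reverse=True)
        positive = chunks[ranked[0][1]]
        negative = chunks[ranked[-1][1]]
        triplets.append((anchor, positive, negative))
    return triplets


# Hypothetical toy chunks, not taken from any real judgment.
chunks = [
    "The petitioner alleged a violation of fundamental rights under the Constitution.",
    "The court held that the fundamental rights of the petitioner were violated.",
    "The lease agreement for the land was terminated by the landlord.",
    "The landlord sought to evict the tenant from the land after the lease ended.",
]
triplets = mine_triplets(chunks)
```

Triplets of this shape are the usual input to a contrastive embedding objective. How SCaLe-QA actually selects positives and negatives, and how the triplets feed the angle-optimised loss, is described in the paper itself.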
Citation
JAYAWARDENA, L., WIRATUNGA, N., ABEYRATNE, R., MARTIN, K., NKISI-ORJI, I. and WEERASINGHE, R. 2024. SCaLe-QA: Sri Lankan case law embeddings for legal QA. In Martin, K., Salimi, P. and Wijayasekara, V. (eds.) 2024. SICSA REALLM workshop 2024: proceedings of the SICSA (Scottish Informatics and Computer Science Alliance) REALLM (Reasoning, explanation and applications of large language models) workshop (SICSA REALLM workshop 2024), 17 October 2024, Aberdeen, UK. CEUR workshop proceedings, 3822. Aachen: CEUR-WS [online], pages 47-55. Available from: https://ceur-ws.org/Vol-3822/short6.pdf
Presentation Conference Type | Conference Paper (published) |
---|---|
Conference Name | 2024 SICSA (Scottish Informatics and Computer Science Alliance) REALLM (Reasoning, explanation and applications of large language models) workshop (SICSA REALLM workshop 2024) |
Start Date | Oct 17, 2024 |
Acceptance Date | Oct 1, 2024 |
Online Publication Date | Oct 17, 2024 |
Publication Date | Nov 4, 2024 |
Deposit Date | Dec 5, 2024 |
Publicly Available Date | Dec 5, 2024 |
Publisher | CEUR-WS |
Peer Reviewed | Peer Reviewed |
Pages | 47-55 |
Series Title | CEUR-workshop proceedings |
Series Number | 3822 |
Series ISSN | 1613-0073 |
Keywords | Text embeddings; Legal AI; RAG; CBR; Legal question answering; Retrieval |
Public URL | https://rgu-repository.worktribe.com/output/2613537 |
Publisher URL | https://ceur-ws.org/Vol-3822/ |
Files
JAYAWARDENA 2024 SCaLe-QA (VOR)
(1.3 Mb)
PDF
Publisher Licence URL
https://creativecommons.org/licenses/by/4.0/
Copyright Statement
© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).