AlignLLM: alignment-based evaluation using ensemble of LLMs-as-judges for Q&A.

Abeyratne, Ramitha; Wiratunga, Nirmalie; Martin, Kyle; Nkisi-Orji, Ikechukwu; Jayawardena, Lasal

Authors

Lasal Jayawardena



Abstract

Evaluating responses generated by large language models (LLMs) is challenging in the absence of ground-truth knowledge, particularly in specialised domains such as law. Increasingly, LLMs themselves are used to evaluate the responses they generate; however, this approach is prone to bias and inherent errors. To address these issues, we propose an unsupervised ensemble method that employs multiple general-purpose LLMs as a 'collective judge', rather than relying on a single model. Here we introduce a novel application of case alignment as an aggregation mechanism, achieving higher correlation with supervised metrics than unsupervised LLM-as-a-judge baselines. Specifically, we construct two spaces for the ensemble: one for reconstructed questions by the ensemble given the model's original responses ('problem-space'), and another for the set of answers generated in response to those reconstructed questions ('solution-space'). By applying similarity-based alignment metrics across these two spaces, we gauge how closely our ensemble-based evaluation metric correlates with accuracy-based metrics that rely on ground-truth data. Our results on two legal Q&A datasets show significant correlations using this alignment strategy, suggesting that it can effectively evaluate LLM-generated responses even when ground truth is unavailable.
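The two-space idea in the abstract can be illustrated with a small sketch. This record does not give the paper's alignment formula, so the code below is only one simple reading of case alignment ("similar problems should have similar solutions"): it correlates pairwise similarities in the problem-space with pairwise similarities in the solution-space. The bag-of-words cosine is a stand-in for whatever embedding model is actually used, and every function name here (`bow_cosine`, `pairwise_sims`, `pearson`, `alignment_score`) is hypothetical rather than taken from the authors' code.

```python
import math
from collections import Counter
from itertools import combinations


def bow_cosine(a: str, b: str) -> float:
    """Cosine similarity over bag-of-words counts (assumed stand-in for embeddings)."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0


def pairwise_sims(texts: list[str]) -> list[float]:
    """Similarity of every unordered pair of texts, in a fixed pair order."""
    return [bow_cosine(x, y) for x, y in combinations(texts, 2)]


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation; returns 0.0 when either series has no variance."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


def alignment_score(questions: list[str], answers: list[str]) -> float:
    """Assumed alignment measure: do pairs that are close in the
    problem-space (reconstructed questions) stay close in the
    solution-space (answers to those questions)?"""
    if len(questions) != len(answers):
        raise ValueError("each reconstructed question needs exactly one answer")
    return pearson(pairwise_sims(questions), pairwise_sims(answers))


# Toy data: one reconstructed question and one answer per ensemble judge.
questions = [
    "what is the notice period for ending a tenancy",
    "what notice period applies when ending a tenancy",
    "how is a company director appointed",
]
answers = [
    "the notice period for ending the tenancy is two months",
    "two months is the notice period when ending a tenancy",
    "a director is appointed by a resolution of the members",
]
print(round(alignment_score(questions, answers), 3))
```

In this sketch a score near 1 means the ensemble's reconstructed questions and their answers agree on which items are alike; at least three question-answer pairs are needed for the correlation to be informative.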

Citation

ABEYRATNE, R., WIRATUNGA, N., MARTIN, K., NKISI-ORJI, I. and JAYAWARDENA, L. [2025]. AlignLLM: alignment-based evaluation using ensemble of LLMs-as-judges for Q&A. In Case-based reasoning research and development: proceedings of the 33rd International conference on case-based reasoning 2025 (ICCBR 2025), 30 June - 03 July 2025, Biarritz, France. Lecture notes in computer science (LNCS), TBC. Cham: Springer [online], (forthcoming).

Presentation Conference Type: Conference Paper (published)
Conference Name: 33rd International conference on case-based reasoning 2025 (ICCBR 2025)
Start Date: Jun 30, 2025
End Date: Jul 3, 2025
Acceptance Date: Mar 14, 2025
Deposit Date: Mar 17, 2025
Publisher: Springer
Peer Reviewed: Peer Reviewed
Series Title: Lecture notes in computer science (LNCS)
Series ISSN: 0302-9743; 1611-3349
Book Title: Case-based reasoning research and development: proceedings of the 33rd International conference on case-based reasoning 2025 (ICCBR 2025), 30 June - 03 July 2025, Biarritz, France
Keywords: LLMs-as-Judges; Case-alignment; Legal Q&A
Public URL: https://rgu-repository.worktribe.com/output/2754840
Related Public URLs: https://rgu-repository.worktribe.com/output/2754880 (Link to code and datasets associated with this output)