This file contains only basic metadata and links to where the software can be accessed. As a
result, the licence under which this file is shared on OpenAIR is not the same as the licence
used for the software itself. Please refer to the "Licence" section below for more information.

GENERAL INFORMATION

1. Title of software:
	A case-based approach to data-to-text generation.

2. Contributor information:
	Ashish Upadhyay (Robert Gordon University)
	Stewart Massie (Robert Gordon University)
	Ritwik Kumar Singh (International Institute of Information Technology)
	Garima Gupta (International Institute of Information Technology)
	Muneendra Ojha (International Institute of Information Technology)
	
3. Date on which software was last updated:
	2021-10-14 (13:07 BST)
	

ACCESS INFORMATION

1. Access Links:
	GitHub:
	https://github.com/ashishu007/data2text-cbr
		
	Internet Archive (Version dated 2021-10-26):
	https://web.archive.org/web/20211026103605/https://github.com/ashishu007/data2text-cbr
	
2. Recommended citation:
	UPADHYAY, A., MASSIE, S., SINGH, R.K., GUPTA, G. and OJHA, M. 2021. A case-based approach to
	data-to-text generation. [Software]. Hosted on GitHub [online]. Available from:
	https://github.com/ashishu007/data2text-cbr 
	
3. Licence:
	This file contains only basic metadata and a link to the external website where the code is
	hosted. This file is distributed under a Creative Commons BY-NC licence
	(https://creativecommons.org/licenses/by-nc/4.0), but the code is available on GitHub under
	an MIT licence (https://github.com/ashishu007/data2text-cbr/blob/main/LICENSE).
	
	
CONTEXTUAL INFORMATION

1. Abstract:
	Traditional Data-to-Text Generation (D2T) systems utilise carefully crafted domain-specific
	rules and templates to generate high-quality, accurate texts. More recent approaches use
	neural systems to learn domain rules from the training data to produce very fluent and
	diverse texts. However, there is a trade-off: rule-based systems produce accurate texts
	that may lack variation, while learning-based systems produce more diverse texts, often
	with poorer accuracy. This code has been used to help propose a case-based
	approach for D2T, which mitigates the impact of this trade-off by dynamically selecting
	templates from the training corpora. In our approach, we develop a novel
	case-alignment-based feature weighting method that is used to build an effective
	similarity measure. Extensive experimentation is performed on a sports domain dataset.
	Through extractive evaluation metrics, we demonstrate the benefit of the case-based
	reasoning (CBR) system over a rule-based baseline
	and a neural benchmark. The GitHub repository includes information on how to use the code.