Data-to-Text Generation (D2T) is usually tackled with a modular approach that breaks the generation process into some variant of planning and realisation phases. Traditional methods produce high-quality texts but are difficult to build for complex domains and lack diversity. Current neural systems, on the other hand, offer scalability and diversity at the expense of accuracy. Case-based approaches try to mitigate this accuracy-diversity trade-off by providing better accuracy than neural systems and better diversity than traditional systems. However, they still fare poorly against neural systems on the dimensions of content selection and diversity. This work proposes CBR-Plan, a case-based approach to content planning in D2T that selects and organises the key components required for producing a summary, based on similar previous examples. Extensive experiments demonstrate the effectiveness of the proposed method against a variety of benchmark and baseline systems, ranging from template-based to case-based and neural systems. The experimental results indicate that CBR-Plan selects more relevant and diverse content than other systems.
UPADHYAY, A. and MASSIE, S. 2022. A case-based approach for content planning in data-to-text generation. In Keane, M.T. and Wiratunga, N. (eds.) Case-based reasoning research and development: proceedings of the 30th International conference on case-based reasoning (ICCBR 2022), 12-15 September 2022, Nancy, France. Lecture notes in computer science, 13405. Cham: Springer [online], pages 380-394. Available from: https://doi.org/10.1007/978-3-031-14923-8_25