Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 

Generative language models for paragraph-level question generation

Ushio, Asahi, Alva Manchego, Fernando and Camacho Collados, Jose ORCID: https://orcid.org/0000-0003-1618-7239 2022. Generative language models for paragraph-level question generation. Presented at: Conference on Empirical Methods in Natural Language Processing, Abu Dhabi, UAE, 7-11 December 2022. Published in: Goldberg, Y., Kozareva, Z. and Zhang, Y. eds. Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, pp. 670-688. 10.18653/v1/2022.emnlp-main.42

PDF (2022.emnlp-main.42.pdf) - Published Version
Available under License Creative Commons Attribution.

Download (1MB)

Abstract

Powerful generative models have led to recent progress in question generation (QG). However, it is difficult to measure advances in QG research since there are no standardized resources that allow a uniform comparison among approaches. In this paper, we introduce QG-Bench, a multilingual and multidomain benchmark for QG that unifies existing question answering datasets by converting them to a standard QG setting. It includes general-purpose datasets such as SQuAD for English, datasets from ten domains and two styles, as well as datasets in eight different languages. Using QG-Bench as a reference, we perform an extensive analysis of the capabilities of language models for the task. First, we propose robust QG baselines based on fine-tuning generative language models. Then, we complement automatic evaluation based on standard metrics with an extensive manual evaluation, which in turn sheds light on the difficulty of evaluating QG models. Finally, we analyse both the domain adaptability of these models and the effectiveness of multilingual models in languages other than English. QG-Bench is released along with the fine-tuned models presented in the paper (https://github.com/asahi417/lm-question-generation), which are also available as a demo (https://autoqg.net/).
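The released checkpoints can be loaded with standard Hugging Face tooling. Below is a minimal, illustrative sketch of paragraph-level question generation with one of the fine-tuned models; the checkpoint name, the "generate question:" prefix, and the "<hl>" answer-highlighting convention are assumptions made for this example, so consult the linked GitHub repository for the exact checkpoint names and input format.

# Minimal sketch of paragraph-level QG with a fine-tuned seq2seq model.
# The checkpoint name and input format below are assumptions; see
# https://github.com/asahi417/lm-question-generation for the released models.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

MODEL_NAME = "lmqg/t5-base-squad-qg"  # assumed checkpoint identifier

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME)

# Paragraph-level input: the full paragraph with the target answer span
# highlighted, so the model generates a question whose answer is that span.
paragraph = (
    "William Turner was an English painter who specialised in watercolour "
    "landscapes. He is often known as William Turner of Oxford."
)
answer = "watercolour landscapes"
source = "generate question: " + paragraph.replace(answer, f"<hl> {answer} <hl>")

inputs = tokenizer(source, return_tensors="pt", truncation=True)
output_ids = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))  # generated question

The same pattern extends to the multilingual and multidomain checkpoints described in the abstract by swapping the model identifier; the hosted demo (https://autoqg.net/) provides an interactive alternative.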

Item Type: Conference or Workshop Item (Paper)
Date Type: Publication
Status: Published
Schools: Advanced Research Computing @ Cardiff (ARCCA)
Computer Science & Informatics
Publisher: Association for Computational Linguistics
Date of First Compliant Deposit: 6 September 2023
Last Modified: 14 Jun 2024 15:24
URI: https://orca.cardiff.ac.uk/id/eprint/161897
