Ushio, Asahi, Alva Manchego, Fernando and Camacho Collados, Jose ORCID: https://orcid.org/0000-0003-1618-7239 2022. Generative language models for paragraph-level question generation. Presented at: Conference on Empirical Methods in Natural Language Processing, Abu Dhabi, UAE, 7-11 December 2022. Published in: Goldberg, Y., Kozareva, Z. and Zhang, Y. eds. Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, pp. 670-688. 10.18653/v1/2022.emnlp-main.42
PDF - Published Version
Available under License Creative Commons Attribution.
Abstract
Powerful generative models have led to recent progress in question generation (QG). However, it is difficult to measure advances in QG research since there are no standardized resources that allow a uniform comparison among approaches. In this paper, we introduce QG-Bench, a multilingual and multidomain benchmark for QG that unifies existing question answering datasets by converting them to a standard QG setting. It includes general-purpose datasets such as SQuAD for English, datasets from ten domains and two styles, as well as datasets in eight different languages. Using QG-Bench as a reference, we perform an extensive analysis of the capabilities of language models for the task. First, we propose robust QG baselines based on fine-tuning generative language models. Then, we complement automatic evaluation based on standard metrics with an extensive manual evaluation, which in turn sheds light on the difficulty of evaluating QG models. Finally, we analyse both the domain adaptability of these models as well as the effectiveness of multilingual models in languages other than English. QG-Bench is released along with the fine-tuned models presented in the paper (https://github.com/asahi417/lm-question-generation), which are also available as a demo (https://autoqg.net/).
Item Type: | Conference or Workshop Item (Paper) |
---|---|
Date Type: | Publication |
Status: | Published |
Schools: | Advanced Research Computing @ Cardiff (ARCCA); Computer Science & Informatics |
Publisher: | Association for Computational Linguistics |
Date of First Compliant Deposit: | 6 September 2023 |
Last Modified: | 14 Jun 2024 15:24 |
URI: | https://orca.cardiff.ac.uk/id/eprint/161897 |