Frayling, Lora, Suarj Bharat, Shah, Pattinson, Elizabeth, Stock, Joshua, Lugg-Widger, Fiona ORCID: https://orcid.org/0000-0003-0029-9703, Gordon, Emma and Oliver, Emily 2025. Review of synthetic data terminology for privacy preserving use cases. International Journal of Population Data Science 10 (2), 08. 10.23889/ijpds.v10i2.2967
PDF - Published Version. Available under License Creative Commons Attribution.
Abstract
Synthetic data is emerging as a key area of development for supporting research that involves secure forms of administrative and health data, both in the United Kingdom and globally. In practice, key challenges in the generation and adoption of synthetic data are closely tied to the need for agreed and consistent terminology for describing it. The absence of standardised language hinders the setting of quality standards, establishment of governance and guidelines and effective sharing of knowledge and best practices. This has implications for research that uses synthetic healthcare and administrative data, particularly when such data are generated from protected personal data. This commentary paper reviews existing literature on synthetic data to explore how key terms are currently defined in practice, with a focus on privacy-preserving use cases. Our analysis reveals that terms describing properties of synthetic data are often lacking and inconsistent, largely due to the breadth of synthetic data types, contexts and use cases. Context-specific terminology with nuanced meanings complicates efforts to develop universally agreed definitions, particularly for privacy-preserving synthetic data that captures characteristics from protected data sources. To address this, we propose broad definitions for key terms including synthetic data, utility, utility measure and fidelity. We conclude by offering a set of recommendations emphasising the need for consensus on terminology and encouraging clearer descriptions in future literature that specify both the intended use of the data and the measures used to describe it.
| Item Type: | Article |
|---|---|
| Date Type: | Publication |
| Status: | Published |
| Schools: | Schools > Medicine Research Institutes & Centres > Centre for Trials Research (CNTRR) |
| Publisher: | Swansea University |
| ISSN: | 2399-4908 |
| Date of First Compliant Deposit: | 27 October 2025 |
| Date of Acceptance: | 27 June 2025 |
| Last Modified: | 27 Oct 2025 12:45 |
| URI: | https://orca.cardiff.ac.uk/id/eprint/181910 |