Alali, Abdulazeez and Theodorakopoulos, George ORCID: https://orcid.org/0000-0003-2701-7809 2023. An RFP dataset for Real, Fake, and Partially fake audio detection. Presented at: Springer 9th International Conference on Cyber Security, Privacy in Communication Networks (ICCS2023), Cardiff, Wales, UK, 11-12 December 2023.
PDF - Accepted Post-Print Version (286kB)
Available under License Creative Commons Attribution.
Abstract
Recent advances in deep learning have enabled the creation of natural-sounding synthesised speech. However, attackers have also utilised these technologies to conduct attacks such as phishing. Numerous public datasets have been created to facilitate the development of effective detection models. However, available datasets contain only entirely fake audio; therefore, detection models may miss attacks that replace a short section of the real audio with fake audio. In recognition of this problem, the current paper presents the RFP dataset, which comprises five distinct audio types: partial fake (PF), audio with noise, voice conversion (VC), text-to-speech (TTS), and real. The data are then used to evaluate several detection models, revealing that the available detection models incur a markedly higher equal error rate (EER) when detecting PF audio instead of entirely fake audio. The lowest EER recorded was 25.42%. Therefore, we believe that creators of detection models must seriously consider using datasets like RFP that include PF and other types of fake audio.
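For readers unfamiliar with the metric, the EER is the operating point at which the false acceptance rate (fake audio accepted as real) equals the false rejection rate (real audio rejected as fake). The following is a minimal sketch of that general definition, not the evaluation code used in the paper. The function name `equal_error_rate`, the convention that higher scores mean "more likely real", and the toy scores are all assumptions made for illustration.

```python
def equal_error_rate(real_scores, fake_scores):
    """Approximate the EER by sweeping every observed score as a threshold.

    Assumption: a sample is accepted as real when its score >= threshold.
    Returns the mean of the two error rates at the threshold where they
    are closest to each other.
    """
    thresholds = sorted(set(real_scores) | set(fake_scores))
    best_gap, eer = float("inf"), None
    for t in thresholds:
        # False acceptance rate: fake samples scored at or above the threshold.
        far = sum(s >= t for s in fake_scores) / len(fake_scores)
        # False rejection rate: real samples scored below the threshold.
        frr = sum(s < t for s in real_scores) / len(real_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Hypothetical detector scores, invented for this example only.
real = [0.9, 0.8, 0.7, 0.4]
fake = [0.6, 0.3, 0.2, 0.1]
print(equal_error_rate(real, fake))
```

With these toy scores, one real sample falls below one fake sample, so a quarter of each class is misclassified at the balancing threshold. A lower EER indicates a better detector.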
Item Type: | Conference or Workshop Item (Paper) |
---|---|
Date Type: | Completion |
Status: | Submitted |
Schools: | Computer Science & Informatics |
Date of First Compliant Deposit: | 24 April 2024 |
Last Modified: | 20 May 2024 10:25 |
URI: | https://orca.cardiff.ac.uk/id/eprint/167934 |