Alali, Abdulazeez and Theodorakopoulos, George
PDF - Accepted Post-Print Version (297kB)
Abstract
The rise of synthetic and manipulated audio content, especially partial fake speech, presents significant challenges for verifying audio authenticity. Partial fake speech refers to segments of audio in which only certain parts have been altered or synthesized, making it more difficult to detect than fully synthetic speech. This paper introduces a novel detection model specifically designed to identify partial fake speech. Our approach incorporates Wav2Vec 2.0 as a feature extractor, along with max pooling, conformer blocks, attention-based pooling, and fully connected layers. Experimental results on two datasets demonstrate the model’s effectiveness in detecting partial fake speech. Our model outperforms existing methods in terms of Equal Error Rate (EER), achieving 0% on the RFP dataset and 2.99% on the ASVSpoof 2019 LA dataset.
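The results above are reported as Equal Error Rate (EER), the operating point at which the rate of spoofed audio accepted as genuine equals the rate of genuine audio rejected. The following is a minimal sketch of how EER can be computed from detector scores. It is not the authors' evaluation code: the function name `equal_error_rate`, the convention that higher scores mean bona fide, and the label encoding (1 for bona fide, 0 for fake) are all assumptions made for illustration.

```python
import numpy as np


def equal_error_rate(scores, labels):
    """Approximate the EER by sweeping a threshold over the observed scores.

    Assumed conventions (hypothetical, not taken from the paper):
    a higher score means "more likely bona fide", and labels use
    1 for bona fide and 0 for fake. Both classes must be present.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)

    bona_fide = scores[labels == 1]
    fake = scores[labels == 0]

    # Candidate thresholds: every observed score, plus one above the maximum
    # so that the "reject everything" operating point is included.
    thresholds = np.unique(scores)
    thresholds = np.concatenate([thresholds, [thresholds[-1] + 1.0]])

    # A sample is accepted as bona fide when its score >= threshold.
    false_accept = np.array([(fake >= t).mean() for t in thresholds])
    false_reject = np.array([(bona_fide < t).mean() for t in thresholds])

    # The EER is taken where the two error rates are closest to each other.
    idx = np.argmin(np.abs(false_accept - false_reject))
    return (false_accept[idx] + false_reject[idx]) / 2.0


if __name__ == "__main__":
    # Toy scores only; these are not results from the paper.
    print(equal_error_rate([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0]))
```

With perfectly separated scores, as in the toy call, both error rates reach zero at the same threshold, which is the situation an EER of 0% describes.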
Item Type: | Conference or Workshop Item (Paper) |
---|---|
Date Type: | Publication |
Status: | Published |
Schools: | Schools > Computer Science & Informatics |
Publisher: | IEEE |
ISBN: | 979-8-3315-3592-6 |
Date of First Compliant Deposit: | 27 August 2025 |
Date of Acceptance: | 27 May 2025 |
Last Modified: | 02 Sep 2025 16:45 |
URI: | https://orca.cardiff.ac.uk/id/eprint/180669 |