AI-ResearchGroup/A-Comprehensive-Survey-with-Critical-Analysis-for-Deepfake-Speech-Detection

A Comprehensive Survey with Critical Analysis for Deepfake Speech Detection

Table II: THE CHALLENGE COMPETITIONS PROPOSED FOR DEEPFAKE SPEECH DETECTION

| Challenge Competition | Year | Data type | Languages | Public Label (train&dev/test) | Audio | Visual | Team No. | Top-1 System |
|---|---|---|---|---|---|---|---|---|
| ASVspoof 2015 (audio) [15] | 2015 | Speech | English | Yes/Yes | Yes | No | 16 | Ensemble |
| ASVspoof 2019 (LA Task) [16] | 2019 | Speech | English | Yes/Yes | Yes | No | 48 | Ensemble |
| DFDC [17], [18] | 2020 | Speech | English | Yes/Yes | Yes | Yes | 2114 | Ensemble |
| FTC [19] | 2020 | Speech | English | No/No | Yes | No | n/a | n/a |
| ASVspoof 2021 (LA Task) [20] | 2021 | Speech | English | Yes/Yes | Yes | No | 41 | Ensemble |
| ASVspoof 2021 (DF Task) [20] | 2021 | Speech | English | Yes/Yes | Yes | No | 33 | Ensemble |
| ADD 2022 Track 1 [21] | 2022 | Speech | Chinese | Yes/Yes | Yes | No | 48 | Single model |
| ADD 2022 Track 2 [21] | 2022 | Speech | Chinese | Yes/Yes | Yes | No | 27 | Single model |
| ADD 2022 Track 3.2 [21] | 2022 | Speech | Chinese | Yes/Yes | Yes | No | 33 | Single model |
| ADD 2023 Track 1.2 [22] | 2023 | Speech | Chinese | No/No | Yes | No | 49 | Ensemble model |
| ADD 2023 Track 2 [22] | 2023 | Speech | Chinese | No/No | Yes | No | 16 | Single model |
| AV-Deepfake1M [23], [24] | 2024 | Speech | English | Yes/No | Yes | Yes | n/a | n/a |
| ASVspoof 2024 [25] | 2024 | Speech | English | Yes/No | Yes | No | 53 | Ensemble model |
| SVDD 2024 [26], [27] | 2024 | Singing | Multi-language (6) | Yes/No | Yes | No | 47 | Ensemble model |

Table III: PUBLIC AND BENCHMARK DATASETS PROPOSED FOR DEEPFAKE SPEECH DETECTION

| Dataset | Year | Language | Speakers (Male/Female) | Utt. No. (Real/Fake) | AI-Synthesized Speech Systems | Speech Condition | Real Speech Resources | Utt. length | Evaluation Metrics |
|---|---|---|---|---|---|---|---|---|---|
| ASVspoof 2015 [15] (audio) | 2015 | English | 45/61 | 16,651/246,500 | 10 | Clean | Speaker Volunteers | 1 to 2 | EER |
| FoR [30] (audio) | 2019 | English | 33 | 198,000+ | 7 | Clean | Kaggle [31] | 2.35 | Acc |
| ASVspoof 2019 (LA task) [16] (audio) | 2019 | English | 46/61 | 121,461 | 19 | Clean & Noisy | Speaker Volunteers | n/a | EER |
| DFDC [32] (video) | 2020 | English | 3,426 | 128,154/104,500 | 1 | Clean & Noisy | Speaker Volunteers | 68.8 | Precision/Recall |
| ASVspoof 2021 (LA task) [20] (audio) | 2021 | English | 21/27 | 18,452/163,114 | 13 | Clean & Noisy | Speaker Volunteers | n/a | EER |
| ASVspoof 2021 (DF task) [20] (audio) | 2021 | English | 21/27 | 22,617/589,212 | 100+ | Clean & Noisy | Speaker Volunteers | n/a | EER |
| WaveFake [33] (audio) | 2021 | English, Japanese | 0/2 | 117,985 | 6 | Clean | LJSPEECH [29] & JSUT [30] | 6s/4.8s | EER |
| KoDF [36] (video) | 2021 | Korean | 198/205 | 62,116/175,776 | 2 | Clean | Speaker Volunteers | 90/15 (real/fake) | Acc & AuC |
| ADD 2022 [21] | 2022 | Chinese | 40/40 | 3,012/24,072 | 2 | Clean | AISHELL-3 [37] | 1s to 10s | EER |
| FakeAVCeleb [38] (video) | 2022 | English | 250/250 | 570/25,000 | 2 | Clean & Noisy | Vox-Celeb2 [39] | 7s | AuC |
| In-the-Wild [40] (audio) | 2022 | English | 58 | 31,779 | 0 | Clean & Noisy | Self-collected | 4.3s | EER |
| LAV-DF [41] (video) | 2022 | English | 153 | 36,431/99,873 | 1 | Clean & Noisy | Vox-Celeb2 [39] | 3s to 20s | AP |
| Voc.v [42] (audio) | 2023 | English | 46/61 | 82,048 | 5 | Clean & Noisy | ASVspoof 2019 LA | n/a | EER |
| CFAD [43] (audio) | 2023 | Chinese | 1,023 | 374,000 | 12 | Clean & Noisy & Codecs | AISHELL1-3 [44], [45], MAGICDATA [46] | n/a | EER |
| PartialSpoof [47] (audio) | 2023 | English | 46/61 | 12,483/108,87 | 19 | Clean & Noisy | ASVspoof 2019 | 0.2s-6.4s | EER |
| LibriSeVoc [48] (audio) | 2023 | English | n/a | 13,201/79,06 | 6 | Clean & Noisy | Librispeech | 5s-34s | EER |
| AV-Deepfake1M [23], [24] (video) | 2023 | English | 2,068 | 286,721/860,039 | 2 | Clean & Noisy | Vox-Celeb2 [33] | 5s-35s | Acc & AuC |
| MLAAD [49] (audio) | 2024 | Multi-Language (23) | n/a | 76,000 | 54 | Clean & Noisy | M-AILABS [50] | n/a | Acc |
| ASVspoof 2024 [25] (audio) | 2024 | English | 964/958 | 188,819/815,262 | 28 | Clean & Noisy | MLS [51] | n/a | EER |
| SVDD 2024 [26] (audio) | 2024 | Multi-Language (6) | 59 | 12,169/72,235 | 48 | Clean | Mandarin & Japanese [27] | n/a | EER |
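
Most datasets in Table III report the equal error rate (EER), the operating point at which the rate of accepted fake utterances equals the rate of rejected real utterances. As a rough illustration only, the sketch below computes an EER from detector scores with NumPy. The function name `compute_eer`, the convention that a higher score means "more likely real", and the toy scores are all assumptions made for this example; they are not taken from any of the listed challenges or their official scoring tools.

```python
import numpy as np


def compute_eer(real_scores, fake_scores):
    """Return (eer, threshold) for two score arrays.

    Assumption: higher score = more likely real (bona fide) speech.
    """
    real = np.asarray(real_scores, dtype=float)
    fake = np.asarray(fake_scores, dtype=float)

    # Every observed score is a candidate decision threshold.
    thresholds = np.unique(np.concatenate([real, fake]))

    # False rejection rate: real utterances scored below the threshold.
    frr = np.array([(real < t).mean() for t in thresholds])
    # False acceptance rate: fake utterances scored at or above the threshold.
    far = np.array([(fake >= t).mean() for t in thresholds])

    # The EER is taken where the two error rates are closest; averaging
    # them handles the case where they never cross exactly.
    idx = int(np.argmin(np.abs(far - frr)))
    eer = (far[idx] + frr[idx]) / 2.0
    return float(eer), float(thresholds[idx])


# Toy scores (hypothetical): one fake utterance overlaps the real range.
toy_real = [0.9, 0.8, 0.7, 0.6]
toy_fake = [0.1, 0.2, 0.3, 0.65]
toy_eer, toy_threshold = compute_eer(toy_real, toy_fake)
print(f"EER = {toy_eer:.2%} at threshold {toy_threshold}")
```

Sweeping only the observed scores keeps the sketch short. Official evaluation scripts may interpolate between thresholds, so values can differ slightly from those a challenge reports.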

Table IV: DEEPFAKE SPEECH GENERATION SYSTEMS USED IN PUBLIC DSD DATASETS (TTS: TEXT TO SPEECH, VC: VOICE CONVERSION, AT: ADVERSARIAL ATTACK USING MALAFIDE OR MALACOPULA)

| Dataset | Year | No. of TTS/VC/AT | Deepfake Speech Generation Systems |
|---|---|---|---|
| ASVspoof 2015 [15] | 2015 | 7 VC, 3 TTS | VC-01 [52], [53], VC-02 [54], TTS-01 [55], TTS-02 [55], VC-03 [56], VC-04 [57], VC-05 [57], VC-06 [58], VC-07 [59], TTS-03 [60] |
| FoR [30] | 2019 | 7 TTS | Deep Voice 3, Amazon AWS Polly, Baidu TTS, Google Traditional TTS, Google Cloud TTS, Google Wavenet TTS, Microsoft Azure TTS |
| ASVspoof 2019 (LA task) [16] | 2019 | 8 VC, 11 TTS | TTS-01 [61], TTS-02 [61], [62], TTS-03 [63], TTS-04 [64], VC-01 [65], VC-02 [66], TTS-05 [63], [67], TTS-06 [61], [68], TTS-07 [69], [70], TTS-08 [71], [72], TTS-09 [71], [72], [73], TTS-10 [74], VC-03+TTS [75], VC-04+TTS [76], [77], VC-05+TTS [76], [77], TTS-11 [64], VC-06 [78], [79], VC-07 [80], [81], [82], VC-08 [66] |
| DFDC [32] | 2020 | 1 TTS | TTS Skins voice conversion [83] |
| KoDF [36] | 2021 | 2 TTS | ATFHP [84] and Wav2Lip [85] |
| ASVspoof 2021 (LA task) [20] | 2021 | 13 TTS/VC | Reuse ASVspoof 2019 |
| ASVspoof 2021 (DF task) [20] | 2021 | 100 TTS/VC | Vocoders [86] |
| WaveFake [33] | 2021 | 6 TTS | MelGAN [87], FB-MelGAN [87], HiFi-GAN [88], WaveGlow [89], PWG [90], MB-MelGAN [87] |
| FakeAVCeleb [38] | 2022 | 2 TTS | SV2TTS [91], [92] |
| In-the-Wild [40] | 2022 | n/a | n/a |
| LAV-DF [41] | 2022 | 1 TTS | SV2TTS [93] |
| Voc.v [42] | 2023 | 5 TTS | HiFi-GAN [88], MB-MelGAN [87], WaveGlow [89], PWG [90], Hn-NSF [94] |
| CFAD [43] | 2023 | 11 TTS | STRAIGHT [95], Griffin-Lim [96], LPCNet [97], WaveNet [98], PWG [90], HiFi-GAN [99], MB-MelGAN [87], MelGAN [87], WORLD [100], FastSpeech [101], Tacotron-HifiGAN [102] |
| PartialSpoof [47] | 2023 | 21 TTS/VC | Reuse ASVspoof 2019 |
| LibriSeVoc [48] | 2023 | 6 TTS/VC | WaveNet [98], WaveRNN [103], MelGAN [87], Parallel WaveGAN [104], WaveGrad [105], DiffWave [106] |
| AV-Deepfake1M [23], [24] | 2023 | 2 TTS | VITS [107], YourTTS [108] |
| MLAAD [49] | 2024 | 54 TTS | Bark, Capacitron, FastPitch, GlowTTS, Griffin Lim, Jenny, NeuralHMM, Overflow, Parler TTS, SpeechT5, Tacotron DDC, Tacotron2, Tacotron2 DCA, Tacotron2 DH, Tacotron2-DDC, Tortoise, VITS, VITS Neon, VITS-MMS, XTTS v1.1, XTTS v2 |
| ASVspoof 2024 [25] | 2024 | 15 TTS, 6 VC, 7 AT | TTS-01 [109], TTS-02 [110], TTS-03 [111], TTS-04 [112], TTS-05 [113], TTS-06 [114], TTS-07 [115], TTS-08 (self-developed), VC-01 [116], TTS-09 [117], VC-02 [118], VC-03 (self-developed), TTS-10 [119], AT-01 (Malafide+TTS-10 [119]), TTS-11 [120], AT-02 (self-developed), TTS-12 [121], TTS-13 [122], AT-03 (Malafide+TTS [123]), VC-04 (self-developed), VC-05 [124], VC-06 (add noise), AT-04 (Malacopula+VC-06), TTS-14 [125], TTS-15 [126], AT-05 (Malacopula+AT-01), AT-06 (Malacopula+TTS-13 [122]), AT-07 (Malacopula+VC-05 [124]) |
