Description of the FMFCC-A dataset (audio track of FMFCC) and the challenge results.


This project describes the FMFCC-A dataset (the audio track of FMFCC) and the challenge results.

The FMFCC-A dataset is shared through BaiduCloud (link: https://pan.baidu.com/s/1CGPkC8VfjXVBZjluEHsW6g , password: IIES). FMFCC-A is by far the largest publicly available Mandarin dataset for synthetic speech detection. It contains 40,000 synthesized Mandarin utterances generated by 11 Mandarin TTS systems and two Mandarin VC systems, and 10,000 genuine Mandarin utterances collected from 58 speakers. The official website of FMFCC-A (the audio track of the first Fake Media Forensic Challenge of the China Society of Image and Graphics) is http://fmfcc.net/ . We hope that FMFCC-A can fill the gap left by the lack of Mandarin datasets for synthetic speech detection under various audio post-processing operations.

If you find the code or dataset useful, please cite the following paper: FMFCC-A: A Challenging Mandarin Dataset for Synthetic Speech Detection