
Audio deepfake detection


This project addresses the task of deepfake audio detection. It is the code accompanying the paper:

Oneata, D., Stan, A., Pascu, O., Oneata, E., & Cucu, H. (2023). Towards generalisable and calibrated synthetic speech detection with self-supervised representations. arXiv preprint arXiv:2309.05384.

Set up

pip install -e .

Extract features

Extract features for the test sets; for example:

for d in asvspoof19 in-the-wild; do
    python aletheia/scripts/extract_features.py -d $d -s test -f wav2vec2-xls-r-2b
done

Extract features for the train sets; for example:

for split in train valid; do
    for num in 1000 2000 4000 8000; do
        for seed in 0 1 2; do
            python aletheia/scripts/extract_features.py -d asvspoof19 -s ${split} -f wav2vec2-xls-r-2b --subset ${num}-${seed}
        done
    done
done
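
The training loop above expands to 2 splits × 4 subset sizes × 3 seeds = 24 runs. To preview the exact invocations before launching them, the following is a minimal sketch that only builds and prints the command lines. It uses just the flags shown above; the helper name `build_commands` is hypothetical and not part of the repository.

```python
from itertools import product

# Values copied from the shell loop above.
SPLITS = ["train", "valid"]
NUMS = [1000, 2000, 4000, 8000]
SEEDS = [0, 1, 2]


def build_commands(dataset="asvspoof19", feature="wav2vec2-xls-r-2b"):
    """Return one extract_features.py invocation per (split, num, seed).

    Hypothetical helper for illustration; it does not run anything.
    """
    commands = []
    for split, num, seed in product(SPLITS, NUMS, SEEDS):
        commands.append([
            "python", "aletheia/scripts/extract_features.py",
            "-d", dataset,
            "-s", split,
            "-f", feature,
            "--subset", f"{num}-{seed}",
        ])
    return commands


if __name__ == "__main__":
    # Dry run: print each command instead of executing it.
    for cmd in build_commands():
        print(" ".join(cmd))
```

Each printed line can be pasted into a shell, or each list can be passed to `subprocess.run(cmd, check=True)` to execute the runs one after another.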

Train and evaluate model

Table 2 in paper:

python aletheia/scripts/evaluate_feature_type.py

Figure 1 in paper:

streamlit run aletheia/scripts/plot_num_training_samples.py

Figure 2 (uncertainty estimation and reliability) in paper:

streamlit run aletheia/scripts/evaluate_reliability_ours_vs_salvi.py


The name of the project, "Aletheia", comes from Greek and means "unconcealedness", "disclosure", or "revealing"; see the corresponding Wikipedia entry for more information.