Cog Sci 2019: Comparing unsupervised speech learning directly to human performance in speech perception

This repository contains large binary files stored with Git LFS. Install Git LFS (https://help.github.com/en/articles/installing-git-large-file-storage) before cloning.

This repository contains everything needed to replicate the human experiment (stimuli, experimental scripts), the model training (DPGMM code, requires MATLAB; corpora available on request), and the analysis (trained models, data, data analysis code) from the paper:

Millet, Juliette, Jurov, Nika, and Dunbar, Ewan. Comparing unsupervised speech learning directly to human performance in speech perception. To appear in the proceedings of Cog Sci 2019, Montreal.

Reproducing the data analysis

The R Markdown file analysis.Rmd contains the complete set of analyses from the paper; knit it to reproduce them.
The output can be seen in analysis.html; the figures, in particular, are under figures/.
A secondary, post-hoc analysis with additional filtering, mentioned in the paper, can be reproduced by knitting analysis__filtered.Rmd.
Experimental data and acoustic/model distances, as well as cached results of long data analysis steps, are stored under data.

The R Markdown file depends on the following packages:

magrittr
dplyr
ggplot2
readr
knitr

Reproducing the human experiment

The folder experiment contains the following items, all needed to construct the ABX discrimination task experiment:

scripts folder, containing the scripts needed to generate the tables, cut the sound files, resample them, etc.;
outputs folder, containing all the tables, .csv files, and other text files generated by the scripts;
stimuli folder, containing the recorded .wav files and the corresponding annotated .TextGrid files. It includes the intervals folder, where all the segmented intervals are stored, and the triplets folder, which holds all the sound files used in the experiment (each a concatenation of three intervals: the A, B, and X sounds);
lmeds_material folder, where all the necessary files for the online experiment are stored;
analysis folder, containing the anonymized data, the scripts necessary for the analysis, and the outputs of those scripts.

See experiment/README.md for details on reproducing the experiment.

Reproducing the DPGMM training

Training corpora are open source and are available on request (they are too large for this repository). Further details on reproducing training can be found in README_DPGMM_training.md. Trained model (.mat) files can be found in models/.

Applying the trained models to the experimental stimuli

The Python code in this section has the following dependencies:

Python 3
librosa
scipy
textgrid
pandas

To generate MFCC and DPGMM posteriorgram features

If you want to transform the entire source files and then cut them (as we did in the paper), you first need to extract features for the uncut wav files:

mkdir -p stimulus_features
python script_create_features_files.py mfccs Stimuli/wavs_source stimulus_features

Then extract posteriors:

mkdir -p english_posteriors french_posteriors
python script_extract_posteriors.py stimulus_features models/English_vtln_1501.mat english_posteriors
python script_extract_posteriors.py stimulus_features models/French_vtln_1501.mat french_posteriors
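
For readers unfamiliar with posteriorgrams: each frame of features is replaced by the vector of posterior probabilities of the mixture components given that frame. The following is a minimal sketch of that computation, assuming diagonal covariances and a toy two-component model with made-up parameters. The layout of the .mat model files and the covariance type used by the trained models are not described here, so this is an illustration of the idea only; script_extract_posteriors.py is the actual implementation.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at frame x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def posteriors(frame, weights, means, variances):
    """Posterior probability of each mixture component given one frame."""
    log_joint = [
        math.log(w) + log_gaussian_diag(frame, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    # Normalize with log-sum-exp for numerical stability.
    top = max(log_joint)
    log_norm = top + math.log(sum(math.exp(lj - top) for lj in log_joint))
    return [math.exp(lj - log_norm) for lj in log_joint]


# Toy model (assumption: made-up numbers, not taken from the trained models).
weights = [0.5, 0.5]
means = [[0.0, 0.0], [3.0, 3.0]]
variances = [[1.0, 1.0], [1.0, 1.0]]

# Three toy 2-dimensional frames; one posterior vector per frame.
frames = [[0.1, -0.2], [2.9, 3.1], [1.5, 1.5]]
posteriorgram = [posteriors(f, weights, means, variances) for f in frames]
for row in posteriorgram:
    print(["%.3f" % p for p in row])
```

Each row of the resulting posteriorgram sums to one, which is what makes a divergence between distributions a natural choice when comparing posteriorgram frames.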

Finally, cut out the relevant parts of the posteriorgrams:

mkdir -p english_posteriors_cut french_posteriors_cut
python script_cut_from_text_grid.py --excluded-words JE,STOCKE,ICI,I,LIKE,HERE,sp word english_posteriors_cut english_posteriors_meta.csv 25 10 Stimuli/textgrids/Cecilia_ABX_ENG_corrected.TextGrid,english_posteriors/Cecilia_ABX_ENG_clean.csv Stimuli/textgrids/Maureen_ABX_ENG_corrected.TextGrid,english_posteriors/Maureen_ABX_ENG_clean.csv Stimuli/textgrids/Cecilia_ABX_FR_corrected.TextGrid,english_posteriors/Cecilia_ABX_FR_clean.csv Stimuli/textgrids/Maureen_ABX_FR_corrected.TextGrid,english_posteriors/Maureen_ABX_FR_clean.csv Stimuli/textgrids/Ewan_ABX_ENG_corrected.TextGrid,english_posteriors/Ewan_ABX_ENG_clean.csv Stimuli/textgrids/Remi_ABX_FR_corrected.TextGrid,english_posteriors/Remi_ABX_FR_clean.csv Stimuli/textgrids/Jeremy_ABX_ENG_corrected.TextGrid,english_posteriors/Jeremy_ABX_ENG_clean.csv Stimuli/textgrids/Veronique_ABX_ENG_corrected.TextGrid,english_posteriors/Veronique_ABX_ENG_clean.csv Stimuli/textgrids/Marc_ABX_FR_corrected.TextGrid,english_posteriors/Marc_ABX_FR_clean.csv Stimuli/textgrids/Veronique_ABX_FR_corrected.TextGrid,english_posteriors/Veronique_ABX_FR_clean.csv
python script_cut_from_text_grid.py --excluded-words JE,STOCKE,ICI,I,LIKE,HERE,sp word french_posteriors_cut french_posteriors_meta.csv 25 10 Stimuli/textgrids/Cecilia_ABX_ENG_corrected.TextGrid,french_posteriors/Cecilia_ABX_ENG_clean.csv Stimuli/textgrids/Maureen_ABX_ENG_corrected.TextGrid,french_posteriors/Maureen_ABX_ENG_clean.csv Stimuli/textgrids/Cecilia_ABX_FR_corrected.TextGrid,french_posteriors/Cecilia_ABX_FR_clean.csv Stimuli/textgrids/Maureen_ABX_FR_corrected.TextGrid,french_posteriors/Maureen_ABX_FR_clean.csv Stimuli/textgrids/Ewan_ABX_ENG_corrected.TextGrid,french_posteriors/Ewan_ABX_ENG_clean.csv Stimuli/textgrids/Remi_ABX_FR_corrected.TextGrid,french_posteriors/Remi_ABX_FR_clean.csv Stimuli/textgrids/Jeremy_ABX_ENG_corrected.TextGrid,french_posteriors/Jeremy_ABX_ENG_clean.csv Stimuli/textgrids/Veronique_ABX_ENG_corrected.TextGrid,french_posteriors/Veronique_ABX_ENG_clean.csv Stimuli/textgrids/Marc_ABX_FR_corrected.TextGrid,french_posteriors/Marc_ABX_FR_clean.csv Stimuli/textgrids/Veronique_ABX_FR_corrected.TextGrid,french_posteriors/Veronique_ABX_FR_clean.csv

Similarly, cut out the relevant parts of the MFCC features:

mkdir -p stimulus_features_cut
python script_cut_from_text_grid.py --excluded-words JE,STOCKE,ICI,I,LIKE,HERE,sp word stimulus_features_cut stimulus_meta.csv 25 10 Stimuli/textgrids/Cecilia_ABX_ENG_corrected.TextGrid,stimulus_features/Cecilia_ABX_ENG_clean.csv Stimuli/textgrids/Maureen_ABX_ENG_corrected.TextGrid,stimulus_features/Maureen_ABX_ENG_clean.csv Stimuli/textgrids/Cecilia_ABX_FR_corrected.TextGrid,stimulus_features/Cecilia_ABX_FR_clean.csv Stimuli/textgrids/Maureen_ABX_FR_corrected.TextGrid,stimulus_features/Maureen_ABX_FR_clean.csv Stimuli/textgrids/Ewan_ABX_ENG_corrected.TextGrid,stimulus_features/Ewan_ABX_ENG_clean.csv Stimuli/textgrids/Remi_ABX_FR_corrected.TextGrid,stimulus_features/Remi_ABX_FR_clean.csv Stimuli/textgrids/Jeremy_ABX_ENG_corrected.TextGrid,stimulus_features/Jeremy_ABX_ENG_clean.csv Stimuli/textgrids/Veronique_ABX_ENG_corrected.TextGrid,stimulus_features/Veronique_ABX_ENG_clean.csv Stimuli/textgrids/Marc_ABX_FR_corrected.TextGrid,stimulus_features/Marc_ABX_FR_clean.csv Stimuli/textgrids/Veronique_ABX_FR_corrected.TextGrid,stimulus_features/Veronique_ABX_FR_clean.csv
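
The three cutting commands differ only in the feature directory, the output directory, and the name of the metadata file. As a convenience, the sketch below assembles the same argument lists programmatically. build_cut_command is a hypothetical helper, not part of this repository; the positional arguments (word, 25, 10) are passed through unchanged from the commands above, and their meaning is whatever script_cut_from_text_grid.py defines.

```python
import shlex

# Recordings listed in the commands above, in the same order.
RECORDINGS = [
    "Cecilia_ABX_ENG", "Maureen_ABX_ENG", "Cecilia_ABX_FR", "Maureen_ABX_FR",
    "Ewan_ABX_ENG", "Remi_ABX_FR", "Jeremy_ABX_ENG", "Veronique_ABX_ENG",
    "Marc_ABX_FR", "Veronique_ABX_FR",
]
EXCLUDED_WORDS = "JE,STOCKE,ICI,I,LIKE,HERE,sp"


def build_cut_command(feature_dir, output_dir, meta_csv):
    """Return the argument list for one script_cut_from_text_grid.py call."""
    # Each pair is "<TextGrid path>,<feature CSV path>", as in the commands above.
    pairs = [
        f"Stimuli/textgrids/{name}_corrected.TextGrid,{feature_dir}/{name}_clean.csv"
        for name in RECORDINGS
    ]
    return [
        "python", "script_cut_from_text_grid.py",
        "--excluded-words", EXCLUDED_WORDS,
        "word", output_dir, meta_csv, "25", "10",
        *pairs,
    ]


# Print the three commands used above (posteriors for both models, then MFCCs).
jobs = [
    ("english_posteriors", "english_posteriors_cut", "english_posteriors_meta.csv"),
    ("french_posteriors", "french_posteriors_cut", "french_posteriors_meta.csv"),
    ("stimulus_features", "stimulus_features_cut", "stimulus_meta.csv"),
]
for feature_dir, output_dir, meta_csv in jobs:
    print(shlex.join(build_cut_command(feature_dir, output_dir, meta_csv)))
```

The printed lines can be pasted into a shell, or the argument lists can be handed to subprocess.run directly.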

Computing distances

To compute distances on posteriorgrams:

python script_ABX.py stimuli/triplets_list.csv english_posteriors_cut kl eng_dpgmm
python script_ABX.py stimuli/triplets_list.csv french_posteriors_cut kl fr_dpgmm

To compute distances on MFCCs:

python script_ABX.py stimuli/triplets_list.csv stimulus_features_cut cosine mfcc
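
To make the kl and cosine options and the ABX decision concrete, here is a minimal frame-level sketch. It rests on several assumptions: cosine distance is taken as one minus cosine similarity, the KL option is shown in a symmetrized form, and each stimulus is reduced to a single vector. How script_ABX.py actually compares whole stimuli (which are sequences of frames) and which exact variant of each distance it uses is not described here, so treat this as an illustration rather than a description of the script.

```python
import math


def cosine_distance(u, v):
    """One minus the cosine similarity of two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions; eps guards against log(0)."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def symmetric_kl(p, q):
    """Average of KL(p || q) and KL(q || p) (assumed variant)."""
    return 0.5 * (kl_divergence(p, q) + kl_divergence(q, p))


def abx_choice(a, b, x, distance):
    """Pick whichever of A or B is closer to X; also return both distances."""
    d_ax = distance(a, x)
    d_bx = distance(b, x)
    return ("A" if d_ax < d_bx else "B"), d_ax, d_bx


# Toy posterior vectors (made-up numbers): X resembles A more than B.
choice_kl, d_ax, d_bx = abx_choice(
    [0.8, 0.1, 0.1], [0.1, 0.1, 0.8], [0.7, 0.2, 0.1], symmetric_kl
)
# Toy MFCC-like vectors (made-up numbers).
choice_cos, _, _ = abx_choice([1.0, 0.0], [0.0, 1.0], [0.9, 0.1], cosine_distance)
print(choice_kl, choice_cos)
```

In both toy cases the model's answer is A, because the AX distance is smaller than the BX distance.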

script_ABX.py creates three files: <output_prefix>_final.csv, a copy of template.csv with the distances added; <output_prefix>_results.csv, of the form A, B, X, real (the right answer), result_model (the model's answer); and <output_prefix>_distances.csv, the same as _results but with the AX and BX distances appended at the end.
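
As a quick sanity check on a <output_prefix>_results.csv file, one can compute the proportion of triplets on which the model's answer matches the right answer. The sketch below assumes the file has a header row with columns named real and result_model; that naming is an assumption based on the description above, and the sample rows are made up for illustration. model_accuracy is a hypothetical helper, not part of this repository.

```python
import csv
import io


def model_accuracy(rows):
    """Proportion of rows where the model's answer equals the right answer."""
    rows = list(rows)
    if not rows:
        raise ValueError("no rows to score")
    correct = sum(
        1 for row in rows if row["real"].strip() == row["result_model"].strip()
    )
    return correct / len(rows)


# Made-up stand-in for a results file; a real file would be opened with
# open(path, newline="") and passed to csv.DictReader in the same way.
sample = io.StringIO(
    "A,B,X,real,result_model\n"
    "a1.wav,b1.wav,x1.wav,A,A\n"
    "a2.wav,b2.wav,x2.wav,B,A\n"
    "a3.wav,b3.wav,x3.wav,B,B\n"
    "a4.wav,b4.wav,x4.wav,A,A\n"
)
accuracy = model_accuracy(csv.DictReader(sample))
print(accuracy)
```

The analyses reported in the paper are the ones in analysis.Rmd; this check is only meant to confirm that the distance computation ran and produced sensible output.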