Cog Sci 2019: Comparing unsupervised speech learning directly to human performance in speech perception
This repository contains large binary files stored with Git LFS. Install Git LFS (https://help.github.com/en/articles/installing-git-large-file-storage) before cloning.
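For example, a typical setup looks like the following (a sketch only; the repository URL below is a placeholder, not the actual address):

git lfs install                # set up Git LFS hooks for your user account
git clone <repository-url>     # placeholder: substitute this repository's URL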
This repository contains everything needed to replicate the human experiment (stimuli, experimental scripts), the model training (DPGMM code, requires MATLAB; corpora available on request), and the analysis (trained models, data, data analysis code), from the paper:
Millet, Juliette, Jurov, Nika, and Dunbar, Ewan. Comparing unsupervised speech learning directly to human performance in speech perception. To appear in the proceedings of Cog Sci 2019, Montreal.
- The R Markdown file `analysis.Rmd` contains the complete set of analyses from the paper, and should be knitted to reproduce them
- The output can be seen in `analysis.html`, and the figures, in particular, under `figures/`
- A secondary, post-hoc analysis with an additional filtering, mentioned in the paper, can be done by knitting `analysis__filtered.Rmd`
- Experimental data and acoustic/model distances, as well as cached results of long data analysis steps, are contained under `data`
The R Markdown file depends on the following packages (see the installation and knitting example after the list):

- `magrittr`
- `dplyr`
- `ggplot2`
- `readr`
- `knitr`
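As a minimal sketch, assuming R and pandoc are installed, the packages can be installed and the analysis knitted from the command line (the `rmarkdown` package is an extra assumption here, needed only for command-line knitting; knitting from RStudio works just as well):

Rscript -e 'install.packages(c("magrittr", "dplyr", "ggplot2", "readr", "knitr", "rmarkdown"))'   # rmarkdown assumed, for command-line knitting
Rscript -e 'rmarkdown::render("analysis.Rmd")'                                                    # produces analysis.html and the figures under figures/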
The repository contains the following files inside the `experiment` folder, needed to construct the ABX discrimination task experiment:

- `scripts`: folder containing the scripts needed to generate the tables, cut the sound files, resample them, etc.;
- `outputs`: folder containing all the tables (.csv or other text files) generated by the scripts;
- `stimuli`: folder containing the recorded .wav files and the corresponding annotated .TextGrids. It includes the `intervals` folder, where all the segmented intervals are stored, and the `triplets` folder, where the sound files used for the experiment (three concatenated intervals: the A, B and X sounds) are stored;
- `lmeds_material`: folder where all the files necessary for the online experiment are stored;
- `analysis`: folder containing the anonymized data, the scripts necessary for the analysis, as well as the outputs of those scripts.
See `experiment/README.md` for details on reproducing the experiment.
Training corpora are open source and are available on request (they are too large for this repository). Further details on reproducing the training can be found in `README_DPGMM_training.md`. Trained model (`.mat`) files can be found in `models/`.
The Python code in this section has the following dependencies (see the installation example after the list):

- Python 3
- `librosa`
- `scipy`
- `textgrid`
- `pandas`
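For example, the Python packages can typically be installed with pip (assuming the PyPI package names match the names listed above):

pip install librosa scipy textgrid pandas   # assumes PyPI names match the names above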
If you want to transform the entire source files and then cut them (as we did in the paper), you first need to extract features for the uncut .wav files:
mkdir -p stimulus_features
python script_create_features_files.py mfccs Stimuli/wavs_source stimulus_features
Then extract posteriors:
mkdir -p english_posteriors french_posteriors
python script_extract_posteriors.py stimulus_features models/English_vtln_1501.mat english_posteriors
python script_extract_posteriors.py stimulus_features models/French_vtln_1501.mat french_posteriors
Finally, cut out the relevant parts of the posteriorgrams:
mkdir -p english_posteriors_cut french_posteriors_cut
python script_cut_from_text_grid.py --excluded-words JE,STOCKE,ICI,I,LIKE,HERE,sp word english_posteriors_cut english_posteriors_meta.csv 25 10 Stimuli/textgrids/Cecilia_ABX_ENG_corrected.TextGrid,english_posteriors/Cecilia_ABX_ENG_clean.csv Stimuli/textgrids/Maureen_ABX_ENG_corrected.TextGrid,english_posteriors/Maureen_ABX_ENG_clean.csv Stimuli/textgrids/Cecilia_ABX_FR_corrected.TextGrid,english_posteriors/Cecilia_ABX_FR_clean.csv Stimuli/textgrids/Maureen_ABX_FR_corrected.TextGrid,english_posteriors/Maureen_ABX_FR_clean.csv Stimuli/textgrids/Ewan_ABX_ENG_corrected.TextGrid,english_posteriors/Ewan_ABX_ENG_clean.csv Stimuli/textgrids/Remi_ABX_FR_corrected.TextGrid,english_posteriors/Remi_ABX_FR_clean.csv Stimuli/textgrids/Jeremy_ABX_ENG_corrected.TextGrid,english_posteriors/Jeremy_ABX_ENG_clean.csv Stimuli/textgrids/Veronique_ABX_ENG_corrected.TextGrid,english_posteriors/Veronique_ABX_ENG_clean.csv Stimuli/textgrids/Marc_ABX_FR_corrected.TextGrid,english_posteriors/Marc_ABX_FR_clean.csv Stimuli/textgrids/Veronique_ABX_FR_corrected.TextGrid,english_posteriors/Veronique_ABX_FR_clean.csv
python script_cut_from_text_grid.py --excluded-words JE,STOCKE,ICI,I,LIKE,HERE,sp word french_posteriors_cut french_posteriors_meta.csv 25 10 Stimuli/textgrids/Cecilia_ABX_ENG_corrected.TextGrid,french_posteriors/Cecilia_ABX_ENG_clean.csv Stimuli/textgrids/Maureen_ABX_ENG_corrected.TextGrid,french_posteriors/Maureen_ABX_ENG_clean.csv Stimuli/textgrids/Cecilia_ABX_FR_corrected.TextGrid,french_posteriors/Cecilia_ABX_FR_clean.csv Stimuli/textgrids/Maureen_ABX_FR_corrected.TextGrid,french_posteriors/Maureen_ABX_FR_clean.csv Stimuli/textgrids/Ewan_ABX_ENG_corrected.TextGrid,french_posteriors/Ewan_ABX_ENG_clean.csv Stimuli/textgrids/Remi_ABX_FR_corrected.TextGrid,french_posteriors/Remi_ABX_FR_clean.csv Stimuli/textgrids/Jeremy_ABX_ENG_corrected.TextGrid,french_posteriors/Jeremy_ABX_ENG_clean.csv Stimuli/textgrids/Veronique_ABX_ENG_corrected.TextGrid,french_posteriors/Veronique_ABX_ENG_clean.csv Stimuli/textgrids/Marc_ABX_FR_corrected.TextGrid,french_posteriors/Marc_ABX_FR_clean.csv Stimuli/textgrids/Veronique_ABX_FR_corrected.TextGrid,french_posteriors/Veronique_ABX_FR_clean.csv
Similarly, cut out the relevant parts of the MFCC features:
mkdir -p stimulus_features_cut
python script_cut_from_text_grid.py --excluded-words JE,STOCKE,ICI,I,LIKE,HERE,sp word stimulus_features_cut stimulus_meta.csv 25 10 Stimuli/textgrids/Cecilia_ABX_ENG_corrected.TextGrid,stimulus_features/Cecilia_ABX_ENG_clean.csv Stimuli/textgrids/Maureen_ABX_ENG_corrected.TextGrid,stimulus_features/Maureen_ABX_ENG_clean.csv Stimuli/textgrids/Cecilia_ABX_FR_corrected.TextGrid,stimulus_features/Cecilia_ABX_FR_clean.csv Stimuli/textgrids/Maureen_ABX_FR_corrected.TextGrid,stimulus_features/Maureen_ABX_FR_clean.csv Stimuli/textgrids/Ewan_ABX_ENG_corrected.TextGrid,stimulus_features/Ewan_ABX_ENG_clean.csv Stimuli/textgrids/Remi_ABX_FR_corrected.TextGrid,stimulus_features/Remi_ABX_FR_clean.csv Stimuli/textgrids/Jeremy_ABX_ENG_corrected.TextGrid,stimulus_features/Jeremy_ABX_ENG_clean.csv Stimuli/textgrids/Veronique_ABX_ENG_corrected.TextGrid,stimulus_features/Veronique_ABX_ENG_clean.csv Stimuli/textgrids/Marc_ABX_FR_corrected.TextGrid,stimulus_features/Marc_ABX_FR_clean.csv Stimuli/textgrids/Veronique_ABX_FR_corrected.TextGrid,stimulus_features/Veronique_ABX_FR_clean.csv
To compute distances on the posteriorgrams, run:
python script_ABX.py stimuli/triplets_list.csv english_posteriors_cut kl eng_dpgmm
python script_ABX.py stimuli/triplets_list.csv french_posteriors_cut kl fr_dpgmm
To compute distances on the MFCCs, run:
python script_ABX.py stimuli/triplets_list.csv stimulus_features_cut cosine mfcc
This script creates three files:

- `<output_prefix>_final.csv`: a copy of `template.csv` with the distances added;
- `<output_prefix>_results.csv`: one row per triplet, with columns A, B, X, real (the right answer) and result_model (the model's answer); see the accuracy check below;
- `<output_prefix>_distances.csv`: the same as `_results.csv`, but with the AX and BX distances added at the end.
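As a quick sanity check, overall model accuracy can be computed by comparing the real and result_model columns of a `_results.csv` file. This is only a sketch: it assumes the columns appear in the order listed above and that the file has a header row, and it uses the `eng_dpgmm` output as an example:

# proportion of triplets where result_model matches real (columns 4 and 5 assumed)
awk -F',' 'NR > 1 { gsub(/ /, "", $4); gsub(/ /, "", $5); total++; if ($4 == $5) correct++ } END { print correct / total }' eng_dpgmm_results.csv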