Python code for GMM-UBM and MAP adaptation based speaker verification

Citation:
[1] Z.-H. Tan, A. K. Sarkar and N. Dehak, "rVAD: an unsupervised segment-based robust voice activity detection method," Computer Speech and Language, 2019,
where speaker verification is used as one downstream application of VAD.

The code was tested on Python 2.7.

(0)/ Workflow of the code:
feature extraction -->> GMM-UBM training -->> GMM-UBM+MAP (target model) -->> Scoring [log likelihood ratio]

(1)/ Feature extraction (MFCC+RASTA, VAD, CMN):
=========================================================================
1.1) First, create the list file for feature extraction, i.e. "feat.lst"
     Contents: [1st column] --> source wave file, [2nd column] --> destination feature file
     e.g.
     wav/reddots_r2015q4_v1/pcm/m0067/20150611185947809_m0067_840.wav,feat/reddots_r2015q4_v1/pcm/m0067/20150611185947809_m0067_840.htk
     wav/reddots_r2015q4_v1/pcm/m0067/20150701164352946_m0067_39.wav,feat/reddots_r2015q4_v1/pcm/m0067/20150701164352946_m0067_39.htk

1.2) Run the following command in a Bash shell:
     >> OMP_NUM_THREADS=1 python featureExtract.py

     [#] Change the following feature extraction parameters in "featureExtract.py" as required, e.g. (default)
         winlen, ovrlen, pre_coef, nfilter, nftt = 0.025, 0.01, 0.97, 20, 512
         #[window size (sec)], [frame shift (sec)], [pre-emphasis coeff], [no. of filters in MFCC], [N-point FFT]

     [#] If you do not want to apply the default RASTA filtering to the MFCCs, comment out the following line in "mfcc.py"
         t=rastaFilter(t).T
         and replace it with "t=t.T"

     [#] Default VAD: energy threshold, i.e. "opts==1"
         - To use the rVAD labels generated by MATLAB, set "opts==0" and then follow the instructions inside "featureExtract.py" to plug in the VAD file

     [#] To disable VAD, set "opts" to any value other than 0 or 1, e.g. "opts==3"

     [#] To disable CMN, comment out the following line in "featureExtract.py"
         f=cmvn(f)
         i.e. "#f=cmvn(f)"

(2)/ GMM-UBM training
================================================================================
2.1) First, create the list file for the GMM/UBM training data, i.e. "UBM.lst"
     e.g. [each row contains one feature file]
     feat/TIMIT/TEST/DR1/FAKS0/SA1.htk
     feat/TIMIT/TEST/DR1/FAKS0/SA2.htk
     feat/TIMIT/TEST/DR1/FAKS0/SI1573.htk
     feat/TIMIT/TEST/DR1/FAKS0/SI2203.htk

     **Important note: files that contain only a single frame/feature vector are discarded before UBM training starts,
     because of the way array/matrix elements are indexed in Python.

2.2) Run the following command in a Bash shell:
     >> OMP_NUM_THREADS=1 python GMMtrn.py

     [#] Default parameters in "GMMtrn.py" (edit them as required; they select different ways of training the GMM)
         nmix, dsfactor, rmd, emIter = 4, 10, 0, 5
         #[mixture power of 2], dsfactor = decimation of frames during intermediate UBM training/file (speed up), [EM iterations]
         # rmd = 1 ; 1) randomize frames --> 2) decimation [llh may not increase in EM for interm. model]

     [#] Default directory for saving the GMM (change it as required)
         ubmDir= 'GMM' + str(nmix)

(3)/ GMM-UBM+MAP (target model)
=================================================================================
3.1) First, prepare the list file for the target models derived from the UBM, i.e. "target.ndx"
     e.g. [1st column] --> target model id, [2nd column] --> feature file
     m0001_31,feat/reddots_r2015q4_v1/pcm/m0001/20150130084154554_m0001_31.htk
     m0001_31,feat/reddots_r2015q4_v1/pcm/m0001/20150130084155412_m0001_31.htk
     m0001_31,feat/reddots_r2015q4_v1/pcm/m0001/20150130084156114_m0001_31.htk
     m0001_32,feat/reddots_r2015q4_v1/pcm/m0001/20150130084156879_m0001_32.htk
     m0001_32,feat/reddots_r2015q4_v1/pcm/m0001/20150130084157752_m0001_32.htk
     m0001_32,feat/reddots_r2015q4_v1/pcm/m0001/20150130084158439_m0001_32.htk
     m0001_33,feat/reddots_r2015q4_v1/pcm/m0001/20150130084159156_m0001_33.htk

     **Important note: make sure that none of the files contains only a single frame/feature vector.
     Please discard those files from the list or duplicate the frame at least twice;
     otherwise an error will occur because of the way array/matrix elements are indexed in Python.

3.2) Run the following command in a Bash shell:
     >> OMP_NUM_THREADS=1 python TargetTRN.py

     [#] Default MAP parameters (change them as required in "TargetTRN.py")
         MapItr, Tau = 3, 10.0 #[no. of MAP iterations], [value of relevance factor]

     [#] By default the UBM is read from a folder in the current directory, e.g. GMM512 (change it as required)
         ubmDir= 'GMM' + str(nmix)

(4)/ Scoring [log likelihood ratio]
=============================================================================
4.1) First, prepare the trial list file, i.e. "m_part_01.ndx"
     e.g. [1st column] --> claimant model id, [2nd column] --> test trial feature file
     m0001_31,feat/reddots_r2015q4_v1/pcm/m0001/20150129213253016_m0001_36.htk
     m0001_31,feat/reddots_r2015q4_v1/pcm/m0001/20150129213254935_m0001_32.htk
     m0060_40,feat/reddots_r2015q4_v1/pcm/m0067/20150611185843833_m0067_36.htk

     **Important note: make sure that none of the files contains only a single frame/feature vector.
     Please discard those files from the list or duplicate the frame at least twice;
     otherwise an error will occur because of the way array/matrix elements are indexed in Python.

4.2) Set the number of threads for parallel scoring (default)
     CORES=2

4.3) Set the score file name and directory (default)
     Scorefile='score.txt' #output file : scores

4.4) Run the following command in a Bash shell:
     >> OMP_NUM_THREADS=1 python Scoring.py
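
[#] Illustration of steps (3) and (4)
    The sketch below shows, on toy data, what relevance-MAP adaptation of the UBM means and log-likelihood-ratio scoring compute. It is not the code in "TargetTRN.py" or "Scoring.py". The function names, the diagonal covariances, the mean-only adaptation, the use of the UBM means as the prior in every iteration, and the toy data are all assumptions made for this example; only the relevance factor (10.0) and the number of MAP iterations (3) mirror the defaults listed above.

```python
import numpy as np


def log_gauss_diag(X, means, variances):
    """Log density of every frame under every diagonal Gaussian; shape (T, M)."""
    D = X.shape[1]
    diff = X[:, None, :] - means[None, :, :]                      # (T, M, D)
    expo = -0.5 * np.sum(diff ** 2 / variances[None, :, :], axis=2)
    log_norm = -0.5 * (D * np.log(2.0 * np.pi) + np.sum(np.log(variances), axis=1))
    return expo + log_norm[None, :]


def logsumexp(a, axis):
    """Numerically stable log(sum(exp(a))) along one axis."""
    m = np.max(a, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(a - m), axis=axis))


def avg_loglik(X, weights, means, variances):
    """Average per-frame log-likelihood of X under the GMM."""
    lp = log_gauss_diag(X, means, variances) + np.log(weights)[None, :]
    return float(np.mean(logsumexp(lp, axis=1)))


def map_adapt_means(X, weights, ubm_means, variances, tau=10.0, n_iter=3):
    """Mean-only relevance-MAP adaptation (assumption: weights/variances stay at UBM values)."""
    adapted = ubm_means.copy()
    for _ in range(n_iter):
        lp = log_gauss_diag(X, adapted, variances) + np.log(weights)[None, :]
        post = np.exp(lp - logsumexp(lp, axis=1)[:, None])        # responsibilities (T, M)
        n = post.sum(axis=0)                                      # soft counts per mixture
        ex = post.T.dot(X) / np.maximum(n, 1e-10)[:, None]        # data mean per mixture
        alpha = (n / (n + tau))[:, None]                          # data-dependent mixing factor
        adapted = alpha * ex + (1.0 - alpha) * ubm_means          # interpolate with UBM prior
    return adapted


def llr_score(X, weights, ubm_means, target_means, variances):
    """Log-likelihood ratio: target model versus UBM, averaged over frames."""
    return (avg_loglik(X, weights, target_means, variances)
            - avg_loglik(X, weights, ubm_means, variances))


# Toy example: a 2-mixture "UBM" in 2 dimensions (hypothetical values).
rng = np.random.RandomState(0)
weights = np.array([0.5, 0.5])
ubm_means = np.array([[-1.0, 0.0], [1.0, 0.0]])
variances = np.ones((2, 2))

enrol = rng.randn(200, 2) + np.array([1.5, 1.0])       # enrolment frames of the target
target_means = map_adapt_means(enrol, weights, ubm_means, variances, tau=10.0, n_iter=3)

test_target = rng.randn(100, 2) + np.array([1.5, 1.0])     # same "speaker"
test_impostor = rng.randn(100, 2) + np.array([-1.5, -1.0])  # different "speaker"
llr_target = llr_score(test_target, weights, ubm_means, target_means, variances)
llr_impostor = llr_score(test_impostor, weights, ubm_means, target_means, variances)
print(llr_target, llr_impostor)
```

    A larger relevance factor keeps the adapted means closer to the UBM when little enrolment data falls on a mixture; a trial is accepted when its score exceeds a threshold chosen on development data.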
zhenghuatan/GMM-UBM_MAP_SV
Python code for training and testing of GMM-UBM and maximum a posteriori (MAP) adaptation based speaker verification
Python