sailpracticum: A MATLAB repository from ClovisIRex

############ README ############
Software: Vocal arousal toolkit
Version: 1.0
Author: Daniel Bone
Affiliation: Signal Analysis and Interpretation Laboratory, University of Southern California
Date: 2014
Contact: dbone@usc.edu, sail.usc.edu/~dbone
Overview: This software extracts vocal arousal from speech data, either at the turn level or at the frame level (with windowing).

**** please cite ****
Daniel Bone, Chi-Chun Lee, and Shrikanth S. Narayanan, "Robust Unsupervised Arousal Rating: A rule-based framework with knowledge-inspired vocal features", in IEEE Transactions on Affective Computing (accepted, 2014).

other papers using this software:
Daniel Bone, Chi-Chun Lee, Alexandros Potamianos, and Shrikanth Narayanan, "An Investigation of Vocal Arousal Dynamics in Child-Psychologist Interactions using Synchrony Measures and a Conversation-based Model", in Proceedings of InterSpeech, Singapore, 2014.
Daniel Bone, Chi-Chun Lee, and Shrikanth Narayanan, "A Robust Unsupervised Arousal Rating Framework using Prosody with Cross-Corpora Evaluation", in Proceedings of InterSpeech, Portland, OR, USA, 2012.

**** terms of use ****
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; version 2 dated June, 1991 or at your option any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

**** general operation ****
The software is run through Matlab using a main .m file, a header file (_main.txt), and possibly additional annotation files. The output is a score file (_scores.txt/csv) in the format score, time. This program requires Praat.exe to run (on Windows there is a version that does not call the GUI). Example main and score files are provided with this software (/examples). There are 3 versions of the program:
1) arousalRating.m
	purpose: single file gets a single vocal arousal rating
	input: main file and required path variables. main file in format: input file, additional path for output, neutral, arousal, speaker (e.g., VAM_main.txt)
	output: score, annotated score (e.g., VAM_scores.txt)
	subfunctions: aR_extractFeats, aR_computeScores_pitchLogRaw, aR_computeScores_intensityRaw, aR_computeScores_ltas, aR_fuseScores
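
To make the main file layout above concrete, here is a minimal sketch that writes a two-line main file and reads it back. It is not part of the toolkit. The comma delimiter, the file names, the speaker ID, and the arousal values are all assumptions made for illustration; check the files in /examples for the actual layout.

```matlab
% Sketch: write a turn-level main file and read it back.
% ASSUMPTIONS (not confirmed by this README): fields are comma-separated,
% one entry per line; all names and values below are hypothetical.
mainFile = fullfile(tempdir, 'demo_main.txt');
rows = { ...
    'spk01_utt01.wav', 'out', 1, 0, 'spk01'; ...
    'spk01_utt02.wav', 'out', 0, 1, 'spk01'};

fid = fopen(mainFile, 'w');
for k = 1:size(rows, 1)
    % input file, additional path for output, neutral, arousal, speaker
    fprintf(fid, '%s,%s,%d,%g,%s\n', rows{k, :});
end
fclose(fid);

% Read the five columns back as strings and numbers.
fid = fopen(mainFile, 'r');
C = textscan(fid, '%s %s %f %f %s', 'Delimiter', ',');
fclose(fid);

wavFiles  = C{1};
isNeutral = C{3} == 1;
speakers  = C{5};
fprintf('%d entries, %d marked neutral\n', numel(wavFiles), nnz(isNeutral));
```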

2) arousalRating_fullFile.m
	purpose: chop a long file into turns and get a rating for each turn (similar to arousalRating.m)
	input: main file and required path variables. main file in format: input file, additional path for output, neutral, arousal, speaker, start time, end time
	output: score, annotated score
	subfunctions: aR_extractFeatsFullFile, aR_computeScores_pitchLogRawFullFile, aR_computeScores_intensityRawFullFile, aR_computeScores_ltasFullFile, aR_fuseScores
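
The idea of chopping a long file into turns by start time and end time can be sketched as follows. This is only an illustration, not the code in aR_extractFeatsFullFile. It assumes the times are given in seconds, and it uses a synthetic signal with a hypothetical sample rate instead of a real recording.

```matlab
% Sketch: cut turns out of a long recording by start/end time.
% ASSUMPTION: times are in seconds; the signal here is synthetic.
fs = 16000;                      % sample rate in Hz (hypothetical)
x = randn(3 * fs, 1);            % 3 s stand-in for a long recording
turns = [0.00 1.00; 1.50 2.75];  % [start end] per turn, in seconds

segments = cell(size(turns, 1), 1);
for k = 1:size(turns, 1)
    firstIdx = floor(turns(k, 1) * fs) + 1;             % 1-based first sample
    lastIdx  = min(numel(x), floor(turns(k, 2) * fs));  % clamp to file end
    segments{k} = x(firstIdx:lastIdx);
    fprintf('turn %d: %d samples (%.2f s)\n', ...
        k, numel(segments{k}), numel(segments{k}) / fs);
end
```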

3) arousalRatingContinuous.m (experimental)
	purpose: framewise (10ms) vocal arousal rating with some smoothing
	input: main file and required path variables. main file in format: input file, additional path for output, speaker, annotation file (e.g., CreativeIT_main.txt). annotation file in format: col1: time, col2: arousal
	output: score, time
	note: only provides continuous ratings that are relative (global normalization), so the absolute values should not be interpreted.
	subfunctions: aR_extractFeats, aRC_computeScores_pitchLogRaw2, aRC_computeScores_intensityRaw2, aRC_computeScores_ltas2, aRC_fuseScores, nanMedFilt, nanMedFiltNan
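
The smoothing mentioned above can be pictured as a running median that skips missing frames. The sketch below is a generic NaN-aware median filter written for illustration; it is not the toolkit's nanMedFilt, and the window size, the sample scores, and the frame-start time stamps are assumptions.

```matlab
% Sketch: NaN-aware running median over framewise scores.
% ASSUMPTIONS: window of 3 frames, hypothetical scores, and time stamps
% taken at frame starts with a 10 ms step.
scores  = [0.1 NaN 0.3 0.9 0.2 NaN NaN 0.4]';
halfWin = 1;                              % frames on each side of the center
smoothed = nan(size(scores));

for n = 1:numel(scores)
    lo = max(1, n - halfWin);
    hi = min(numel(scores), n + halfWin);
    w = scores(lo:hi);
    w = w(~isnan(w));                     % ignore missing frames in the window
    if ~isempty(w)
        smoothed(n) = median(w);          % stays NaN if the window is all NaN
    end
end

t = (0:numel(scores) - 1)' * 0.010;       % frame times in seconds
disp([smoothed, t]);                      % columns: score, time
```

Because the continuous ratings are globally normalized, only the shape of the smoothed curve is meaningful here, not its absolute level.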

Other functions:
A) chopWav.m - chops wav by placing Gaussian white noise where a person is not speaking according to the VAD
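
The noise-filling step described for chopWav.m can be sketched as below. This is a minimal illustration under stated assumptions, not the actual chopWav.m code: the signal is synthetic, the VAD mask is hypothetical, and the noise level (noiseStd) is an arbitrary choice.

```matlab
% Sketch: replace non-speech samples with low-level Gaussian white noise.
% ASSUMPTIONS: synthetic signal, hypothetical VAD mask, arbitrary noise level.
fs = 16000;                          % sample rate in Hz (hypothetical)
x = 0.1 * randn(fs, 1);              % 1 s stand-in for a speech signal
vad = false(size(x));
vad(4001:12000) = true;              % pretend the speaker talks in the middle

noiseStd = 1e-3;                     % assumed standard deviation of the fill noise
y = x;
y(~vad) = noiseStd * randn(nnz(~vad), 1);   % overwrite non-speech regions only

fprintf('kept %d speech samples, replaced %d samples with noise\n', ...
    nnz(vad), nnz(~vad));
```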

General Note 1: Each version assumes there is only one speaker per file (as does Praat's f0 extraction).
General Note 2: If you want to do global speaker normalization (i.e., you don't have neutral affective labels for turns), then set all files as neutral=1. This will give you a relative rating of a speaker's vocal arousal between files or turns, which means that the absolute value may have less meaning. Also remember that the software only works when there are multiple utterances from a single speaker.
General Note 3: If there are no annotations, any value can be inserted in the "arousal" spot of the main file.
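
As a sketch of General Note 2, the snippet below marks every entry as neutral=1 and flags speakers that have fewer than two utterances, since the software needs multiple utterances per speaker. The speaker IDs are hypothetical and the check is an illustration only, not something the toolkit performs for you as far as this README states.

```matlab
% Sketch: prepare global speaker normalization and check utterance counts.
% ASSUMPTION: speaker IDs below are hypothetical, one per main-file entry.
speakers = {'spk01'; 'spk01'; 'spk02'};
neutral  = ones(numel(speakers), 1);      % neutral=1 for every file or turn

[names, ~, idx] = unique(speakers);       % idx maps each entry to a speaker
counts = accumarray(idx(:), 1);           % number of utterances per speaker
tooFew = names(counts < 2);               % speakers with a single utterance

for k = 1:numel(tooFew)
    fprintf('warning: only one utterance for speaker %s\n', tooFew{k});
end
```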