Aggression in Hindi and English Speech

This repository contains data, models and some utility scripts generated as part of the UGC-UKIERI project titled "Automatic Detection of Verbal Threat in Hindi and English Aggressive Speech", led jointly by Dr. Ritesh Kumar (K.M. Institute of Hindi and Linguistics, Dr. Bhimrao Ambedkar University, Agra) and Prof. Daniel Kadar (University of Huddersfield, UK), and carried out in collaboration with University of Surrey, UK, Jawaharlal Nehru University, New Delhi, Microsoft Research India, Bangalore, UnReaL-TecE LLP and Panlingua Language Processing LLP, New Delhi.

Data

The directory 'data' in the repository contains

TextGrid files, and
The training files (in SVMLight format) used for building the models. The speech-signal features were extracted using the OpenSMILE v2.2 library (v3.0 is now available), and the models were trained using the SVM Multiclass library. These files can be used directly for training and experimenting with more models, without extracting the features again.
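To inspect the training files before feeding them to a learner, a minimal Python sketch of reading the SVMLight format is given here. It assumes the usual layout of one example per line (`<label> <index>:<value> ... # optional comment`); the function name and the sample line are hypothetical and are not taken from the repository.

```python
from typing import Dict, Tuple


def parse_svmlight_line(line: str) -> Tuple[int, Dict[int, float]]:
    """Parse one SVMLight-format line into (label, {feature_index: value})."""
    # Drop an optional trailing comment introduced by '#'.
    content = line.split("#", 1)[0].strip()
    if not content:
        raise ValueError("empty or comment-only line")
    tokens = content.split()
    # SVM Multiclass expects integer class labels as the first token.
    label = int(tokens[0])
    features: Dict[int, float] = {}
    for token in tokens[1:]:
        index, value = token.split(":", 1)
        features[int(index)] = float(value)
    return label, features


if __name__ == "__main__":
    # Made-up example line, for illustration only.
    label, features = parse_svmlight_line("2 1:0.5 3:-1.25 7:4 # clip_001")
    print(label, features)
```

Features that are absent from a line are implicitly zero, which is why a dictionary keyed by feature index is used rather than a dense list.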

The raw audio files can be accessed at the following links -

The original video files are accessible via the links included in the METADATA file. The metadata file also contains information such as the mapping of audio files to their respective TextGrid files, and the size, format and duration of each audio file.

Models

The directory 'model' contains the best models for Hindi and English. These files were generated by the SVM Multiclass library and are expected to work with it.
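As a rough illustration of how such a model file might be applied, the Python sketch below builds an argument list for SVM Multiclass's `svm_multiclass_classify` program (which takes an example file, a model file and an output file) and reads the predicted class from each line of its output. All file names are placeholders and the helper names are hypothetical; this is a sketch under those assumptions, not the project's own workflow.

```python
from typing import List


def classify_command(examples: str, model: str, predictions: str) -> List[str]:
    """Build the svm_multiclass_classify argument list (paths are placeholders)."""
    return ["svm_multiclass_classify", examples, model, predictions]


def predicted_labels(prediction_lines: List[str]) -> List[int]:
    """Return the predicted class from each non-empty line of a predictions file."""
    # Each output line starts with the predicted class, followed by per-class scores.
    return [int(line.split()[0]) for line in prediction_lines if line.strip()]


if __name__ == "__main__":
    # Placeholder file names, for illustration only.
    print(classify_command("test.dat", "example.model", "predictions.txt"))
    # Made-up output lines, for illustration only.
    print(predicted_labels(["2 -0.3 1.2 0.1", "1 0.9 -0.4 -0.2"]))
```

The command list can be passed to `subprocess.run(cmd, check=True)` once the SVM Multiclass binaries are installed and on the `PATH`.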

Scripts

The directory 'scripts' contains some helper shell scripts (used and tested on Ubuntu) for pre-processing the video files and generating the features for training the models. These scripts include the following -

1_video_to_audio - This script extracts the sound track from the .mp4 video files and saves it in the WAV format.
2_make_compatible_audio - It converts an audio file into a format compatible with the OpenSMILE library.
3_save_labeled_intervals_to_wav_sound_files - It is a Praat script that automatically slices an annotated sound file into multiple files.
4a_extract_features_hindi - It extracts features from multiple Hindi audio files using the OpenSMILE library, as specified in the library's config file.
4b_extract_features_english - It extracts features from multiple English audio files using the OpenSMILE library, as specified in the library's config file.
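For readers who want to reproduce steps 1, 2 and 4 outside the provided shell scripts, the Python sketch below shows one plausible way to assemble the underlying commands. It assumes `ffmpeg` is used for audio extraction and that 16 kHz mono 16-bit PCM is an acceptable input for OpenSMILE; both are assumptions, and the repository's scripts may use different tools or settings. The function names and file names are hypothetical.

```python
from typing import List


def video_to_wav_command(video_path: str, wav_path: str,
                         sample_rate: int = 16000) -> List[str]:
    """Build an ffmpeg command that drops the video stream and writes mono PCM WAV."""
    return [
        "ffmpeg", "-i", video_path,
        "-vn",                    # ignore the video stream
        "-acodec", "pcm_s16le",   # 16-bit PCM audio
        "-ar", str(sample_rate),  # sample rate (assumed value)
        "-ac", "1",               # mono (assumed)
        wav_path,
    ]


def extract_features_command(config_path: str, wav_path: str,
                             output_path: str) -> List[str]:
    """Build a SMILExtract command: -C config file, -I input audio, -O output file."""
    return ["SMILExtract", "-C", config_path, "-I", wav_path, "-O", output_path]


if __name__ == "__main__":
    # Placeholder file names, for illustration only.
    print(video_to_wav_command("clip.mp4", "clip.wav"))
    print(extract_features_command("features.conf", "clip.wav", "clip.arff"))
```

Each list can be run with `subprocess.run(cmd, check=True)`; looping over a directory of WAV files gives the batch behaviour described for scripts 4a and 4b.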

App

We plan to release the code of the test app soon. At present, however, the app is accessible online via the following link - Aggression Recognition Tool ART

Feedback and Contact

For any feedback / suggestions / collaboration, please contact Dr. Ritesh Kumar @ ritesh78_llh at jnu dot ac dot in.
