yazdayy/SED-intern

Python

Sound Event Detection

Predicition System

Upload the audio clips you would like to process to a folder
- The path to this directory will be your $INPUT_DIR
- If your audio files are not in .wav format, the prediction system will automatically convert them from their current format to .wav

Run the following command:
Original:

python pytorch/predict_new.py predict_asr --dataset_dir=$INPUT_DIR --workspace=workspace --holdout_fold=1 --model_type=Cnn_9layers_Gru_FrameAtt_edit --loss_type=clip_bce --augmentation=specaugment_mixup --feature_type=logmel --batch_size=8 --cuda --audio_16k --sed_thresholds --filename=main_strong --overlap

With dynamic segmentation:

python pytorch/predict_new.py predict_silent_asr --dataset_dir=$INPUT_DIR --workspace=workspace --holdout_fold=1 --model_type=Cnn_9layers_Gru_FrameAtt_edit --loss_type=clip_bce --augmentation=specaugment_mixup --feature_type=logmel --batch_size=8 --cuda --audio_16k --sed_thresholds --filename=main_strong

The prediction output is saved in the 'workspace/predict_results' directory in the following xml format: