johnmartinsson/bird-species-classification

Time Shift Data Augmentation


Implement a method which randomly shifts the time-frequency input data in the time domain.

The time-shifted data will be used as additional training samples for the neural network, in order to encourage time-shift invariance.

Implementation details:

  1. split the spectrogram into two parts at a random point along the time axis
  2. place the second part in front of the first (wrap-around shift)
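
The two steps above could be sketched roughly as follows. This is only an illustration, not the repository's code: the function name `time_shift_spectrogram`, the `rng` parameter, and the assumption that the spectrogram is a 2-D NumPy array of shape `(n_frequency_bins, n_time_frames)` are all hypothetical.

```python
import numpy as np


def time_shift_spectrogram(spectrogram, rng=None):
    """Wrap-around time shift of a 2-D spectrogram.

    Assumes (hypothetically) that axis 0 is frequency and axis 1 is time.
    """
    rng = np.random.default_rng() if rng is None else rng
    n_frames = spectrogram.shape[1]

    # Step 1: pick a random split point along the time axis.
    split = int(rng.integers(0, n_frames))
    first, second = spectrogram[:, :split], spectrogram[:, split:]

    # Step 2: place the second part in front of the first.
    return np.concatenate([second, first], axis=1)
```

The output has the same shape as the input and contains the same columns, only rotated, so no information is lost. A split of 0 returns the input unchanged.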

Sprengel et al., 2016:

"Every time we present the neural network with a training example, we shift it in time by a random amount. In terms of the spectrogram this means that we cut it into two parts and place the second part in front of the first (wrap around shifts). This creates a sharp corner where the end of the second part meets the beginning of the first part but all the information is preserved. With this augmentation we force the network to deal with irregularities in the spectrogram and also, more importantly, teach the network that bird songs/calls appear at any time, independent of the bird species."

Time shift data augmentation is now implemented. The shift is applied in the time domain (to the audio signal) instead of the spectral domain (to the spectrogram).
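
For reference, a time-domain variant might look something like the sketch below. It is an assumption about the general approach, not the code that was actually committed: `time_shift_signal` and its `rng` parameter are hypothetical names, and the signal is assumed to be a 1-D NumPy array of audio samples.

```python
import numpy as np


def time_shift_signal(signal, rng=None):
    """Wrap-around shift of a 1-D audio signal by a random number of samples."""
    rng = np.random.default_rng() if rng is None else rng

    # Random shift in [0, len(signal)); a shift of 0 leaves the signal unchanged.
    shift = int(rng.integers(0, len(signal)))

    # np.roll moves the last `shift` samples to the front (wrap-around).
    return np.roll(signal, shift)
```

Shifting the signal before computing the spectrogram avoids having to split at a frame boundary, and the spectrogram is then computed from the shifted samples as usual.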