birdclef2021: A Python repository from mohsinkhn

TODO: 30th April

13th May -

Generate melspecs for faster iteration, 5 sec samples
Setup SED model with densenet121, resnest50d, efficientnet-b0
Use 5 second clips
Add Gaussian, Pink, Band noise and BAD dataset as background noise
Random power for signal
Use modified mixup - change mixup to use both labels
Use secondary labels with 0.3 as target
Setup evaluation --> if a bird is predicted in any segment with a score greater than threshold T, increase the prior for that bird for the complete file
Submit on LB -->
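
The modified mixup and secondary-label items above could be sketched roughly as follows. This is only one plausible reading of "use both labels": the inputs are blended as in standard mixup, but the targets are combined with an element-wise max rather than interpolated. The function names `make_target` and `mixup_both_labels` and the value `alpha=0.4` are assumptions for illustration, not taken from the repository.

```python
import numpy as np

SECONDARY_TARGET = 0.3  # soft target for secondary labels, as in the notes


def make_target(primary, secondary, num_classes, secondary_target=SECONDARY_TARGET):
    """Multi-label target: 1.0 for the primary class, a soft value for secondaries."""
    target = np.zeros(num_classes, dtype=np.float32)
    for idx in secondary:
        target[idx] = secondary_target
    target[primary] = 1.0
    return target


def mixup_both_labels(x1, y1, x2, y2, rng, alpha=0.4):
    """Blend two inputs but keep both label sets.

    Standard mixup would use lam * y1 + (1 - lam) * y2 for the target;
    here the target is the element-wise max (an assumed variant), so a
    bird present in either clip keeps its full target value.
    """
    lam = rng.beta(alpha, alpha)  # hypothetical choice of mixing distribution
    x = lam * x1 + (1.0 - lam) * x2
    y = np.maximum(y1, y2)
    return x, y
```

With this variant, mixing a clip labelled with bird 0 (plus secondary bird 2) and a clip labelled with bird 1 gives a target of 1.0 for both primaries and 0.3 for the secondary, regardless of the mixing weight.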

17th May

effb0 baseline scores - 0.71
res50_effb0+mix+rex different stratified folds - 0.71x
Try simple kfold and average scores -- DONE - 0.73 for res26
Restructure evaluation script, provision for location wise priors - Done
res26d 5-kfold, effb0 5-kfold, res50 5-kfold - Done
Post processing techniques - In progress
Better evaluation metrics on soundscapes - Todo
Add noise from soundscapes to training - Todo
Find out echo augmentation - Todo
Band pass filter for post processing, high frequency cutoff? - Todo
log mean-max pooling from Jan Schlüter - Todo - P1
Combining channels with different weights - most files are mono, so not useful
30 second training - should be coupled with 0.5 weight for secondary labels
Try pretraining on audioset? - Todo
Training time improvements for faster iteration - Todo
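
For the pooling item in the list above, here is a minimal sketch assuming the note refers to log-mean-exp pooling, which interpolates between mean and max pooling over time frames. That reading is an assumption, and the function name `log_mean_exp_pool` and the sharpness parameter `a` are hypothetical.

```python
import numpy as np


def log_mean_exp_pool(frame_scores, a=1.0, axis=0):
    """Pool frame-level scores: (1 / a) * log(mean(exp(a * x))).

    Small `a` behaves like mean pooling, large `a` approaches max pooling.
    The max is subtracted before exponentiating for numerical stability.
    """
    z = a * np.asarray(frame_scores, dtype=np.float64)
    z_max = z.max(axis=axis, keepdims=True)
    pooled = z_max + np.log(np.mean(np.exp(z - z_max), axis=axis, keepdims=True))
    return np.squeeze(pooled, axis=axis) / a


# Example: rows are time frames, columns are classes.
scores = np.array([[0.0, 1.0], [2.0, 1.0]])
clip_scores = log_mean_exp_pool(scores, a=1.0)
```

The pooled value for each class always lies between the mean and the max of its frame scores, so `a` can be tuned to control how much a single confident frame dominates the clip-level prediction.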

19th May

So far what has worked - used training augmentations from vlomme
Combined res26 and effb0 preds
Moving from vlomme post processing to a logit model helped a bit, maybe try GBM
Backbone discriminative experiments -
Different backbones -
External datasets
Pretraining ?
Adding nocall from current soundscapes as background noise, also changing background noises
Restarts for current trained data
More resolution ?
Separate threshold for each site - Done
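
The per-site threshold item above might look something like the sketch below. The site codes, threshold values, and the `predict_labels` helper are all hypothetical placeholders, not the values or code used in this repository.

```python
# Hypothetical per-site thresholds; real values would be tuned on validation soundscapes.
SITE_THRESHOLDS = {"COR": 0.30, "SSW": 0.45}
DEFAULT_THRESHOLD = 0.40  # fallback for sites without a tuned threshold


def predict_labels(probs, site, class_names,
                   thresholds=SITE_THRESHOLDS, default=DEFAULT_THRESHOLD):
    """Return space-separated bird labels for one segment, or "nocall"."""
    threshold = thresholds.get(site, default)
    picked = [name for name, p in zip(class_names, probs) if p >= threshold]
    return " ".join(picked) if picked else "nocall"


# The same probabilities can yield different outputs depending on the site.
print(predict_labels([0.35, 0.10], "COR", ["bird_a", "bird_b"]))
print(predict_labels([0.35, 0.10], "SSW", ["bird_a", "bird_b"]))
```

Keeping the thresholds in a plain dictionary makes it easy to tune each site separately while falling back to a shared default for unseen sites.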

External datasets and pretraining seem to be strongest levers as of now. Current options for external data are:

additional bird recordings from xeno-canto - could be used to pretrain for better finetuning
soundscapes - add soundscapes from previous competitions and current one
audioset pretraining - has helped in PANN based models
