This is the code for my master's thesis, which builds on my two previous works (MAAC and FeatureCut).
- Clone this repository:
https://github.com/Vancause/KAMA_AC.git
- Install PyTorch >= 1.8.0.
- Use pip to install the remaining dependencies (an optional environment check follows):
`pip install -r requirements.txt`
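After installing, you can confirm the environment with a quick check:

```python
# Optional sanity check: confirm the installed PyTorch meets the requirement.
import torch

print(torch.__version__)          # should be 1.8.0 or newer
print(torch.cuda.is_available())  # True if a CUDA-enabled build and GPU are present
```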
- Download the Clotho dataset for the DCASE2021 Automated Audio Captioning challenge. For how to prepare the training data and set up coco caption, please refer to the DCASE2020 BUPT team's repository.
- Enter the `audio_tag` directory.
- First, run `python generate_word_list.py` to create the word list `word_list_pretrain_rules.p` and the mapping from tagging words to embedding-layer indexes, `TaggingtoEmbs`.
- Then run `python generate_tag.py` to generate `audioTagName_{development/validation/evaluation}_fin_nv.pickle` and `audioTagNum_{development/validation/evaluation}_fin_nv.pickle`; a quick inspection sketch follows below.
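To check that the generation step worked, here is a minimal inspection sketch. The paths are assumptions (the files are assumed to sit in the `audio_tag` directory), and the internal structure of the pickles depends on the scripts, so only types are printed:

```python
# Minimal sketch: peek at the generated word list and tag files.
# Paths below are assumptions; adjust them to wherever the scripts
# actually write their output.
import pickle

with open('word_list_pretrain_rules.p', 'rb') as f:
    word_list = pickle.load(f)
print('word list container:', type(word_list))

with open('audioTagName_development_fin_nv.pickle', 'rb') as f:
    tag_names = pickle.load(f)
print('tag container:', type(tag_names))
```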
- Download the dataset from https://github.com/XinhaoMei/ACT.
- Generate the `.npy` files by running `generate_audiocaps_files.py`; a loading sketch follows below.
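A short sketch to verify that a generated file loads correctly. The file name is a placeholder, not a name the script is known to produce:

```python
# Minimal sketch: load one of the generated .npy feature files with NumPy.
# 'example_feature.npy' is a placeholder; substitute a real output file.
import numpy as np

features = np.load('example_feature.npy', allow_pickle=True)  # allow_pickle in case of object arrays
print(type(features))
print(getattr(features, 'shape', None))  # shape if it is a plain ndarray
```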
The training configuration is stored in `hparams.py`, and you can change it to your own parameters.
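The exact settings live in the repository's `hparams.py`; the sketch below is purely illustrative of the usual pattern, and the attribute names and values here are assumptions, not the repo's actual defaults:

```python
# Hypothetical illustration of a typical hparams.py layout.
# The real attribute names and defaults are defined in the repository.
class HParams:
    dataset = 'clotho'   # assumed option name; e.g. 'clotho' or 'audiocaps'
    batch_size = 32      # example value only
    lr = 3e-4            # example value only
    max_epochs = 30      # example value only

hparams = HParams()

# Training scripts would then read e.g. hparams.batch_size; editing such
# values is how you reset the configuration to your own parameters.
```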
- Run `python run_newtransformer.py` to train the KAMA-AC-T model.
- Run `python run_lstm.py` to train the KAMA-AC-L model.
- In `run_lstm.py` or `run_newtransformer.py`, you can modify the hyper-parameters directly to run the ablations.
- Run `python run_featurecut.py` to train the KAMA-AC-L model with FeatureCut.