kkoutini/PaSST

Pretrained models config


Hi, how can I find the configurations used for the pre-trained models,
e.g. u_patchout, s_patchout_t, s_patchout_f, etc.?

Thank you!

Hi,
Here are the configs for the main pre-trained models:

passt-s-f128-p16-s10-ap.476.pt: {"embed": "default", "input_fdim": 128, "input_tdim": 998, "s_patchout_f": 4, "s_patchout_t": 40, "tstride": 10, "arch": "deit_base_distilled_patch16_384", "pretrained": true, "n_classes": 527, "in_channels": 1, "fstride": 10, "u_patchout": 0, "audioset_pretrain": false, "instance_cmd": "get_model"} 


passt-s-f128-p16-s10-ap.472.pt: {"embed": "default", "arch": "passt_deit_bd_p16_384", "pretrained": true, "n_classes": 527, "in_channels": 1, "fstride": 10, "tstride": 10, "input_fdim": 128, "input_tdim": 998, "u_patchout": 0, "s_patchout_t": 40, "s_patchout_f": 4, "instance_cmd": "get_model"} 


passt-s-f128-p16-s14-ap.469_swa471.pt: {"embed": "default", "arch": "passt_deit_bd_p16_384", "fstride": 14, "s_patchout_f": 3, "s_patchout_t": 30, "tstride": 14, "pretrained": true, "n_classes": 527, "in_channels": 1, "input_fdim": 128, "input_tdim": 998, "u_patchout": 0, "instance_cmd": "get_model"} 


passt-s-f128-p16-s16-ap.468_swa473: {"embed": "default", "fstride": 16, "input_fdim": 128, "input_tdim": 998, "s_patchout_f": 1, "s_patchout_t": 20, "tstride": 16, "arch": "deit_base_distilled_patch16_384", "pretrained": true, "n_classes": 527, "in_channels": 1, "u_patchout": 0, "audioset_pretrain": false, "instance_cmd": "get_model"} 


passt-s-f128-p16-s12-ap.470_swa473.pt: {"embed": "default", "arch": "passt_deit_bd_p16_384", "fstride": 12, "s_patchout_f": 3, "s_patchout_t": 40, "tstride": 12, "pretrained": true, "n_classes": 527, "in_channels": 1, "input_fdim": 128, "input_tdim": 998, "u_patchout": 0, "instance_cmd": "get_model"} 

If you need the configuration for other specific models, let me know.
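
To make the numbers in these configs a bit more concrete, here is a minimal sketch that estimates the patch grid and the number of patches left after structured patchout. It is not taken from the PaSST code. It assumes a patch size of 16 (inferred from "p16" in the file names), an unpadded strided patch embedding, and that `s_patchout_f` / `s_patchout_t` count the frequency rows and time columns dropped during training. The helper names `patch_grid` and `tokens_after_structured_patchout` are hypothetical.

```python
def patch_grid(cfg, patch_size=16):
    """Return (freq_patches, time_patches) for a config dict.

    Assumption: unpadded strided patch embedding, so each axis yields
    floor((input - patch_size) / stride) + 1 patches.
    """
    n_f = (cfg["input_fdim"] - patch_size) // cfg["fstride"] + 1
    n_t = (cfg["input_tdim"] - patch_size) // cfg["tstride"] + 1
    return n_f, n_t


def tokens_after_structured_patchout(cfg, patch_size=16):
    """Patches kept after dropping s_patchout_f rows and s_patchout_t columns.

    Class/distillation tokens are not counted here.
    """
    n_f, n_t = patch_grid(cfg, patch_size)
    kept_f = n_f - cfg["s_patchout_f"]
    kept_t = n_t - cfg["s_patchout_t"]
    return kept_f * kept_t


# Values copied from the passt-s-f128-p16-s10-ap.476.pt config above.
cfg_s10 = {
    "input_fdim": 128, "input_tdim": 998,
    "fstride": 10, "tstride": 10,
    "s_patchout_f": 4, "s_patchout_t": 40, "u_patchout": 0,
}

print(patch_grid(cfg_s10))                        # (12, 99)
print(tokens_after_structured_patchout(cfg_s10))  # 472
```

Under these assumptions, the stride-10 model sees a 12 x 99 grid and trains on 8 x 59 = 472 patches per clip.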

Hi,
Thank you so much for replying.

I have another question.
Is there a pre-trained model with the PaSST-U or PaSST-B architecture?
I would like to know which models have a non-zero 'u_patchout' (PaSST-U), or have 'u_patchout', 's_patchout_t', and 's_patchout_f' all set to zero (PaSST-B),
because I want to compare them.

Thank you!

Hi, I trained two new models, one with no patchout and one with unstructured patchout of 600, and uploaded them [here](https://github.com/kkoutini/PaSST/releases/tag/v.0.0.7-audioset). I hope this helps.
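
For anyone comparing the variants discussed in this thread, here is a minimal sketch that labels a config dict by its patchout settings. The mapping follows the question above (non-zero `u_patchout` for PaSST-U, all patchout values zero for PaSST-B) and treats structured-only patchout as PaSST-S. The function name `patchout_variant` and the two example dicts marked hypothetical are assumptions for illustration, not the configs of the released checkpoints.

```python
def patchout_variant(cfg):
    """Label a config by which kind of patchout it uses (illustrative only)."""
    unstructured = cfg.get("u_patchout", 0)
    structured = cfg.get("s_patchout_t", 0) + cfg.get("s_patchout_f", 0)
    if unstructured == 0 and structured == 0:
        return "PaSST-B"  # no patchout at all
    if unstructured > 0 and structured == 0:
        return "PaSST-U"  # unstructured patchout only
    if unstructured == 0 and structured > 0:
        return "PaSST-S"  # structured patchout only
    return "mixed"        # both kinds enabled


# Patchout values copied from the passt-s-f128-p16-s10-ap.476.pt config above.
cfg_s = {"u_patchout": 0, "s_patchout_t": 40, "s_patchout_f": 4}
# Hypothetical dicts; the actual configs of the new models are not shown here.
cfg_u = {"u_patchout": 600, "s_patchout_t": 0, "s_patchout_f": 0}
cfg_b = {"u_patchout": 0, "s_patchout_t": 0, "s_patchout_f": 0}

for name, cfg in [("s10", cfg_s), ("u600", cfg_u), ("no-patchout", cfg_b)]:
    print(name, patchout_variant(cfg))
```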