praveena2j/Joint-Cross-Attention-for-Audio-Visual-Fusion

Dimension Mismatch Error with Pre-trained Model Parameters

Closed this issue · 2 comments

Hello, thanks a lot for sharing your code and pre-trained model parameters!
When testing your pre-trained model on my own dataset, I ran into a problem. After the two features produced by the AV Fusion Model are concatenated and passed to the model's first fully connected layer, the input has shape 16x1024. However, the weight of that layer in the pre-trained model has shape 512x128, so the forward pass fails with a mismatch between the input dimensions and the weight parameters. I have gone through the parameter assignments in the code but have not been able to find the cause. Do you have any idea what might be causing this?
Thank you for your time and assistance.
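
For reference, the mismatch described above can be checked with a small sketch in plain Python (no PyTorch required). It assumes the convention used by `torch.nn.Linear`, where the weight has shape `(out_features, in_features)` and the last dimension of the input must equal `in_features`. The helper name `linear_output_shape` is hypothetical and not part of the repository. Because "512x128" could be read either way, the sketch tries both orderings.

```python
def linear_output_shape(input_shape, weight_shape):
    """Shape of y = x @ W.T for a weight W of shape (out_features, in_features).

    Mirrors the shape rule of torch.nn.Linear; raises ValueError on a mismatch.
    (Hypothetical helper for illustration only.)
    """
    out_features, in_features = weight_shape
    if input_shape[-1] != in_features:
        raise ValueError(
            f"input has {input_shape[-1]} features, "
            f"but the layer expects {in_features}"
        )
    return tuple(input_shape[:-1]) + (out_features,)


# Shape of the concatenated features, as reported in the issue.
fused_shape = (16, 1024)

# "512x128" is ambiguous, so both readings of the weight shape are tried.
for weight_shape in [(512, 128), (128, 512)]:
    try:
        print(weight_shape, "->", linear_output_shape(fused_shape, weight_shape))
    except ValueError as err:
        print(weight_shape, "-> mismatch:", err)
```

Under either reading the check fails, so a 1024-wide input cannot be fed to this layer as is.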

I have updated the orig_cam.py script. Please use the latest version. Thanks!

Thanks a lot!