GenjiB/LAVISH

Vision Transformers are Parameter-Efficient Audio-Visual Learners

Python

Issues

Need few more details to debug the code for my task
#24 opened 5 months ago by praveena2j
2
About performance in AVQA
#21 opened 5 months ago by kaiw7
1
Can't get the accuracy of AVE reported in the paper with vit_base(75.3%)
#17 opened a year ago by Lecooo
3
dataset shape not match in AVS
#18 opened a year ago by zsevenj
5
avsbench shape not match
#13 opened a year ago by Everglow-ZJU
1
The outputs of the vision transformer models are NaNs
#23 opened 8 months ago by praveena2j
0
How many gpu days does the training procedure of AVE take?
#22 opened 8 months ago by laulliam
0
training logs for AVQA
#20 opened 10 months ago by kaiw7
0
About script of using VPT for AVE
#19 opened a year ago by kaiw7
0
Can't get the similar accuracy of AVE reported in the paper (81.1%)
#14 opened a year ago by liushenme
7
AVQA dimensional error
#12 opened a year ago by liey1
1
Questions about CrossAttention
#16 opened a year ago by kaiw7
0
Some questions about VisualAdapter in AVS
#15 opened a year ago by Divine0719
2
Maybe the effect of the adapter is not so crucial?
#11 opened a year ago by Rainlt
2
Can't get claimed accuracy (81.1%) with the provided configs?
#9 opened a year ago by abdurad
4
Have you noticed that the AVQA model is very unstable when adjusting the lr and batch size?
#10 opened a year ago by Rainlt
1
How many gpu days does the training procedure of AVQA take?
#8 opened 2 years ago by Rainlt
4
cannot get the claimed accuracy in audio-visual event localization.
#2 opened 2 years ago by junwenxiong
2
Is this the true config?
#7 opened 2 years ago by Rainlt
1
The address of your paper is wrong?
#6 opened 2 years ago by Rainlt
1
cannot get the claimed mIOU in audio-visual segmentation.
#5 opened 2 years ago by junwenxiong
4
For audio, you use both .wav format and .npy. Here is a dimensional error in the code.
#3 opened 2 years ago by Bravo5542
3
Some bugs are hidden in the AVS repo
#4 opened 2 years ago by junwenxiong
1
Error for AVQA
#1 opened 2 years ago by Bravo5542
1