simpleshinobu/visdial-principles

Code for Model zoo HCIAE, CoAtt, and RvA

Closed this issue · 1 comments

Thanks for the great work.
Could you provide the code for HCIAE, CoAtt, and RvA that leads to the results reported in the paper?

First, sorry for my late reply; as you can see, I rarely check GitHub. Please contact me by email for a quicker response. Thank you for your interest.

It is straightforward to implement by following the original authors' existing repositories for the encoders and applying our principles. For convenience, I have reproduced the details and updated this repository. I also provide some clues about the implementation details on VisDial-BERT (the state-of-the-art encoder/model), which helped us reach ~78% NDCG on the Visual Dialog Challenge 2020 test-std set. As the frameworks are totally different, I suggest modifying that repository directly.

Furthermore, I am interested in your recent work in this field and have learned a lot from it. I hope for more communication with you in the future.