SunzeY/AlphaCLIP

Table 6: Performance of Alpha-CLIP in region level captioning


Great work!
I am confused about the results in Table 6. Is the reported performance from Alpha-CLIP with LLaVA-1.5, or from fine-tuning this model with Vicuna-7B on these datasets (RefCOCOg or VG)?

Hi, this has been discussed in #24.