[CVPR 2024] Do you remember? Dense Video Captioning with Cross-Modal Memory Retrieval
Primary language: Python | License: MIT