hekj
Ph.D. student at CASIA NLPR CRIPAC. Research interests include Machine Learning, Multimodality, and Embodied AI.
Institute of Automation, Chinese Academy of Sciences, Beijing, China
Pinned Repositories
awesome-embodied-vision
Reading list for research topics in embodied vision
awesome-multimodal-ml
Reading list for research topics in multimodal machine learning
Awesome-Multimodal-Research
A curated list of Multimodal Related Research.
cvpr-latex-template
Extended LaTeX template for CVPR/ICCV papers
FDA
Official implementation of "Frequency-enhanced Data Augmentation for Vision-and-Language Navigation" (NeurIPS 2023)
Landmark-RxR
A human-annotated, fine-grained dataset for Vision-and-Language Navigation
Recurrent-VLN-BERT
Code of the CVPR 2021 Oral paper: A Recurrent Vision-and-Language BERT for Navigation
RxR
Room-across-Room (RxR) is a large-scale, multilingual dataset for Vision-and-Language Navigation (VLN) in Matterport3D environments. It contains 126k navigation instructions in English, Hindi, and Telugu, and 126k navigation-following demonstrations. Both annotation types include dense spatiotemporal alignments between the text and the visual perceptions of the annotators.
VLN-BEVBert
[ICCV 2023] Official repo of "BEVBert: Multimodal Map Pre-training for Language-guided Navigation"
hekj's Repositories
hekj/FDA
Official implementation of "Frequency-enhanced Data Augmentation for Vision-and-Language Navigation" (NeurIPS 2023)
hekj/Landmark-RxR
A human-annotated, fine-grained dataset for Vision-and-Language Navigation
hekj/awesome-embodied-vision
Reading list for research topics in embodied vision
hekj/awesome-multimodal-ml
Reading list for research topics in multimodal machine learning
hekj/Awesome-Multimodal-Research
A curated list of Multimodal Related Research.
hekj/cvpr-latex-template
Extended LaTeX template for CVPR/ICCV papers
hekj/Recurrent-VLN-BERT
Code of the CVPR 2021 Oral paper: A Recurrent Vision-and-Language BERT for Navigation
hekj/RxR
Room-across-Room (RxR) is a large-scale, multilingual dataset for Vision-and-Language Navigation (VLN) in Matterport3D environments. It contains 126k navigation instructions in English, Hindi, and Telugu, and 126k navigation-following demonstrations. Both annotation types include dense spatiotemporal alignments between the text and the visual perceptions of the annotators.
hekj/VLN-BEVBert
[ICCV 2023] Official repo of "BEVBert: Multimodal Map Pre-training for Language-guided Navigation"