junchen14
I am a PhD candidate at KAUST. I am interested in multi-modal learning.
KAUST, Saudi Arabia
Pinned Repositories
Awesome_ChatGPT_papers
This repository collects and shares awesome ChatGPT-related papers and useful tools
LoMaR
LoMaR (Efficient Self-supervised Vision Pretraining with Local Masked Reconstruction)
MiniGPT-4_finetune
Open-source code for MiniGPT-4 and MiniGPT-v2
Multi-Modal-Transformer
This repository collects a variety of multi-modal transformer architectures, including image transformers, video transformers, image-language transformers, video-language transformers, and self-supervised learning models. It also collects useful tutorials and tools in these domains.
RelTransformer_GeneralVRD
video_chatcaptioner
ChatCaptioner
Official Repository of ChatCaptioner
MammalNet
MiniGPT-4
Open-source code for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)
VisualGPT
VisualGPT (CVPR 2022 proceedings): GPT as a decoder for vision-language models
junchen14's Repositories
junchen14/Multi-Modal-Transformer
This repository collects a variety of multi-modal transformer architectures, including image transformers, video transformers, image-language transformers, video-language transformers, and self-supervised learning models. It also collects useful tutorials and tools in these domains.
junchen14/LoMaR
LoMaR (Efficient Self-supervised Vision Pretraining with Local Masked Reconstruction)
junchen14/Awesome_ChatGPT_papers
This repository collects and shares awesome ChatGPT-related papers and useful tools
junchen14/MiniGPT-4_finetune
Open-source code for MiniGPT-4 and MiniGPT-v2
junchen14/RelTransformer_GeneralVRD
junchen14/video_chatcaptioner
junchen14/3DCoMPaT
Official repository for the 3DCoMPaT dataset (ECCV2022 Oral)
junchen14/AliceMind
ALIbaba's Collection of Encoder-decoders from MinD (Machine IntelligeNce of Damo) Lab
junchen14/CLIP
Contrastive Language-Image Pretraining
junchen14/CLIP4Clip
An official implementation for "CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval"
junchen14/DL2Vec
Convert Description Logic axioms into a graph, and generate embedding representation for the nodes.
junchen14/BLIP
PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
junchen14/junchen14.github.io
GitHub Pages template for academic personal websites, forked from mmistakes/minimal-mistakes
junchen14/minigpt_spatial
junchen14/video_language
junchen14/ViLT
Code for the ICML 2021 (long talk) paper: "ViLT: Vision-and-Language Transformer Without Convolution or Region Supervision"
junchen14/VisualGPT-1
junchen14/web-llm
Bringing large-language models and chat to web browsers. Everything runs inside the browser with no server support.