zmykevin

Applied Researcher at Capital One. My research interests lie in multimodal learning with vision and language.

New York

Pinned Repositories

A-Visual-Attention-Grounding-Neural-Model
The code repository for our multimodal machine translation project: A Visual Attention Grounding Neural Network
Language:Python3 5 12
ACL2023_ChartT5
The official code implementation of the ACL 2023 Findings paper: Enhanced Chart Understanding in Vision and Language Task via Cross-modal Pre-training on Plot Table Pairs
Language:Python10 1 01
bert_nli
A Natural Language Inference (NLI) model based on Transformers (BERT and ALBERT)
Language:Python0 1 00
ChartLlama-code
For the code from ChartLlama-code
Language:Python0 0 00
ChartOCR
Language:Python5 0 04
chatgpt-on-wechat
A WeChat chatbot based on ChatGPT, built with the OpenAI GPT-3.5 API and the itchat library.
Language:Python0 0 00
GanDraw
Repository for the GanDraw dataset baselines
Language:Jupyter Notebook1 2 00
transform-and-tell
Code for the Transform and Tell paper at CVPR 2020.
Language:Python1 1 00
UC2
CVPR 2021 Official PyTorch Code for UC2: Universal Cross-lingual Cross-modal Vision-and-Language Pre-training
Language:Python34 4 73
UVLP
CVPR 2022 (Oral) PyTorch Code for Unsupervised Vision-and-Language Pre-training via Retrieval-based Multi-Granular Alignment
Language:Python22 2 31

zmykevin's Repositories

zmykevin/UC2
CVPR 2021 Official PyTorch Code for UC2: Universal Cross-lingual Cross-modal Vision-and-Language Pre-training
Language:Python34 4 73
zmykevin/UVLP
CVPR 2022 (Oral) PyTorch Code for Unsupervised Vision-and-Language Pre-training via Retrieval-based Multi-Granular Alignment
Language:Python22 2 31
zmykevin/ACL2023_ChartT5
The official code implementation of the ACL 2023 Findings paper: Enhanced Chart Understanding in Vision and Language Task via Cross-modal Pre-training on Plot Table Pairs
Language:Python10 1 01
zmykevin/ChartOCR
Language:Python5 0 04
zmykevin/A-Visual-Attention-Grounding-Neural-Model
The code repository for our multimodal machine translation project: A Visual Attention Grounding Neural Network
Language:Python3 5 12
zmykevin/GanDraw
Repository for the GanDraw dataset baselines
Language:Jupyter Notebook1 2 00
zmykevin/transform-and-tell
Code for the Transform and Tell paper at CVPR 2020.
Language:Python1 1 00
zmykevin/bert_nli
A Natural Language Inference (NLI) model based on Transformers (BERT and ALBERT)
Language:Python0 1 00
zmykevin/ChartLlama-code
For the code from ChartLlama-code
Language:Python0 0 00
zmykevin/chatgpt-on-wechat
A WeChat chatbot based on ChatGPT, built with the OpenAI GPT-3.5 API and the itchat library.
Language:Python0 0 00
zmykevin/CLIP
Contrastive Language-Image Pretraining
Language:Jupyter Notebook0 1 00
zmykevin/Detectron
FAIR's research platform for object detection research, implementing popular algorithms like Mask R-CNN and RetinaNet.
Language:Python0 2 00
zmykevin/detectron2
Detectron2 is FAIR's next-generation platform for object detection and segmentation.
Language:Python0 1 00
zmykevin/CLIPTrans
Official implementation for the paper "Transferring Visual Knowledge with Pre-trained Models for Multimodal Machine Translation", published at ICCV'23.
Language:Python0 0
zmykevin/fairseq
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
Language:Python2 0
zmykevin/GeNeVA
Code to train and evaluate the GeNeVA-GAN model for the GeNeVA task proposed in our ICCV 2019 paper "Tell, Draw, and Repeat: Generating and modifying images based on continual linguistic instruction"
Language:Python2 0
zmykevin/GeNeVA_datasets
Scripts to generate the CoDraw and i-CLEVR datasets used for the GeNeVA task proposed in our ICCV 2019 paper "Tell, Draw, and Repeat: Generating and modifying images based on continual linguistic instruction"
Language:Python2 0
zmykevin/gquestions
Find "People Also Ask" questions
Language:HTML1 0
zmykevin/LLaVA
Large Language-and-Vision Assistant built towards multimodal GPT-4 level capabilities.
Language:Python0 0
zmykevin/open_clip
An open source implementation of CLIP.
Language:Python1 0
zmykevin/personal-website
Code that'll help you kickstart a personal website that showcases your work as a software developer.
zmykevin/Prompt-Engineering-Guide
🐙 Guides, papers, lecture, notebooks and resources for prompt engineering
Language:Jupyter Notebook0 0
zmykevin/simple-flask-socketio-example
zmykevin/SPADE
Semantic Image Synthesis with SPADE
Language:Python2 0
zmykevin/StageDP
A two-stage RST discourse parser
zmykevin/transformers
🤗Transformers: State-of-the-art Natural Language Processing for Pytorch and TensorFlow 2.0.
Language:Python2 0
zmykevin/UNITER
Research code for "UNITER: Learning UNiversal Image-TExt Representations"
Language:Python2 0
zmykevin/vilbert_beta
Language:Jupyter Notebook2 0
zmykevin/vokenization
PyTorch code for EMNLP 2020 Paper "Vokenization: Improving Language Understanding with Visual Supervision"
Language:Python1 0
zmykevin/XLM
PyTorch original implementation of Cross-lingual Language Model Pretraining.
Language:Python2 0