MikeWangWZHL
CS PhD at UIUC, Research Assistant at BLENDER lab, advised by Prof. Heng Ji | Intern at Tencent AI Lab | Intern at MSRA
UIUC, Champaign, Illinois
Pinned Repositories
acl-anthology
Data and software for building the ACL Anthology.
Aida_COVID
Repo for Aida Covid Hackathon src
EEG-To-Text
Code for AAAI 2022 paper "Open Vocabulary Electroencephalography-To-Text Decoding and Zero-shot Sentiment Classification"
Multitask-Finetuning_CLIP
Code for paper "Rethinking Task Sampling for Few-shot Vision-Language Transfer Learning" COLING 2022 workshop
Paxion
Repo for paper: "Paxion: Patching Action Knowledge in Video-Language Foundation Models" NeurIPS 2023 Spotlight
Solo-Performance-Prompting
Repo for paper "Unleashing Cognitive Synergy in Large Language Models: A Task-Solving Agent through Multi-Persona Self-Collaboration"
VDLM
Repo for paper: Text-based Reasoning About Vector Graphics
VidIL
PyTorch code for "Language Models with Image Descriptors are Strong Few-Shot Video-Language Learners"
Wikinews_Pipeline
Get news from Wikipedia page's reference section
Zemi
Repo for "Zemi: Learning Zero-Shot Semi-Parametric Language Models from Multiple Tasks" ACL 2023 Findings
MikeWangWZHL's Repositories
MikeWangWZHL/Solo-Performance-Prompting
Repo for paper "Unleashing Cognitive Synergy in Large Language Models: A Task-Solving Agent through Multi-Persona Self-Collaboration"
MikeWangWZHL/EEG-To-Text
Code for AAAI 2022 paper "Open Vocabulary Electroencephalography-To-Text Decoding and Zero-shot Sentiment Classification"
MikeWangWZHL/VidIL
PyTorch code for "Language Models with Image Descriptors are Strong Few-Shot Video-Language Learners"
MikeWangWZHL/Paxion
Repo for paper: "Paxion: Patching Action Knowledge in Video-Language Foundation Models" NeurIPS 2023 Spotlight
MikeWangWZHL/VDLM
Repo for paper: Text-based Reasoning About Vector Graphics
MikeWangWZHL/Zemi
Repo for "Zemi: Learning Zero-Shot Semi-Parametric Language Models from Multiple Tasks" ACL 2023 Findings
MikeWangWZHL/Multitask-Finetuning_CLIP
Code for paper "Rethinking Task Sampling for Few-shot Vision-Language Transfer Learning" COLING 2022 workshop
MikeWangWZHL/Wikinews_Pipeline
Get news from Wikipedia page's reference section
MikeWangWZHL/MikeWangWZHL.github.io
GitHub Pages template for academic personal websites, forked from mmistakes/minimal-mistakes
MikeWangWZHL/1d-tokenizer
This repo contains the code for our paper An Image is Worth 32 Tokens for Reconstruction and Generation
MikeWangWZHL/alfworld-docker-setup
MikeWangWZHL/Cutie
[CVPR 2024 Highlight] Putting the Object Back Into Video Object Segmentation
MikeWangWZHL/diffusers
🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch
MikeWangWZHL/Grounded-Segment-Anything
Grounded-SAM: Marrying Grounding-DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything
MikeWangWZHL/LAVIS
LAVIS - A One-stop Library for Language-Vision Intelligence
MikeWangWZHL/LaVIT
LaVIT: Empower the Large Language Model to Understand and Generate Visual Content
MikeWangWZHL/LLaVA
[NeurIPS 2023 Oral] Visual Instruction Tuning: LLaVA (Large Language-and-Vision Assistant) built towards multimodal GPT-4 level capabilities.
MikeWangWZHL/MathVista
MathVista: data, code, and evaluation for Mathematical Reasoning in Visual Contexts
MikeWangWZHL/maze-dataset
maze datasets for investigating OOD behavior of ML systems
MikeWangWZHL/MiniGPT4-video
MikeWangWZHL/parti-pytorch
Implementation of Parti, Google's pure attention-based text-to-image neural network, in PyTorch
MikeWangWZHL/rq-vae-transformer
The official implementation of Autoregressive Image Generation using Residual Quantization (CVPR '22)
MikeWangWZHL/sam-hq
Segment Anything in High Quality [NeurIPS 2023]
MikeWangWZHL/self-refine
LLMs can generate feedback on their work, use it to improve the output, and repeat this process iteratively.
MikeWangWZHL/singularity
Official PyTorch code for Singularity model in the paper "Revealing Single Frame Bias for Video-and-Language Learning"
MikeWangWZHL/Tracking-Anything-with-DEVA
Forked from paper [ICCV 2023] Tracking Anything with Decoupled Video Segmentation
MikeWangWZHL/VAR
[GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simple, user-friendly yet state-of-the-art* codebase for autoregressive image generation!
MikeWangWZHL/Video-ChatGPT
"Video-ChatGPT" is a video conversation model capable of generating meaningful conversation about videos. It combines the capabilities of LLMs with a pretrained visual encoder adapted for spatiotemporal video representation. We also introduce a rigorous 'Quantitative Evaluation Benchmarking' for video-based conversational models.
MikeWangWZHL/viper
Code for the paper "ViperGPT: Visual Inference via Python Execution for Reasoning"
MikeWangWZHL/VQGAN-LC