chenllliang's Stars
KillianLucas/open-interpreter
A natural language interface for computers
waydabber/BetterDisplay
Unlock your displays on your Mac! Flexible HiDPI scaling, XDR/HDR extra brightness, virtual screens, DDC control, extra dimming, PIP/streaming, EDID override and lots more!
timothybrooks/instruct-pix2pix
CompVis/taming-transformers
Taming Transformers for High-Resolution Image Synthesis
open-mmlab/mmaction2
OpenMMLab's Next Generation Video Understanding Toolbox and Benchmark
jinwchoi/awesome-action-recognition
A curated list of action recognition and related area resources
InternLM/InternLM-XComposer
InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output
TigerResearch/TigerBot
TigerBot: A multi-language multi-task LLM
lucidrains/vector-quantize-pytorch
Vector (and Scalar) Quantization, in PyTorch
baaivision/EVA
EVA Series: Visual Representation Fantasies from BAAI
ytongbai/LVM
deepseek-ai/DeepSeek-LLM
DeepSeek LLM: Let there be answers
openai/procgen
Procgen Benchmark: Procedurally-Generated Game-Like Gym-Environments
yunlong10/Awesome-LLMs-for-Video-Understanding
🔥🔥🔥 Latest Papers, Codes and Datasets on Vid-LLMs.
deepseek-ai/DeepSeek-MoE
DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models
MishaLaskin/vqvae
A PyTorch implementation of the vector quantized variational autoencoder (https://arxiv.org/abs/1711.00937)
openai/coinrun
Code for the paper "Quantifying Transfer in Reinforcement Learning"
FMInference/H2O
[NeurIPS'23] H2O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models.
huggingface/open-muse
Open reproduction of MUSE for fast text2image generation.
MMMU-Benchmark/MMMU
This repo contains evaluation code for the paper "MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI"
RenShuhuai-Andy/TimeChat
[CVPR 2024] TimeChat: A Time-sensitive Multimodal Large Language Model for Long Video Understanding
baaivision/CapsFusion
[CVPR 2024] CapsFusion: Rethinking Image-Text Data at Scale
ryanwebster90/snip-dedup
facebookresearch/EgoVLPv2
Code release for "EgoVLPv2: Egocentric Video-Language Pre-training with Fusion in the Backbone" [ICCV, 2023]
vlf-silkie/VLFeedback
liveseongho/Awesome-Video-Language-Understanding
A survey on video and language understanding.
RenShuhuai-Andy/my-tools
My commonly used tools
AdaCheng/EgoThink
[CVPR'24 Highlight] The official code and data for the paper "EgoThink: Evaluating First-Person Perspective Thinking Capability of Vision-Language Models"
VIStA-H/GPT-4V_Social_Media
GPT-4V(ision) as A Social Media Analysis Engine
zxtan98/CProcgen