vealocia's Stars
THUDM/ChatGLM2-6B
ChatGLM2-6B: An Open Bilingual Chat LLM | An open-source bilingual dialogue language model
QwenLM/Qwen
The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.
amueller/word_cloud
A little word cloud generator in Python
modelscope/facechain
FaceChain is a deep-learning toolchain for generating your digital twin.
facebookresearch/llama-recipes
Scripts for fine-tuning Llama2 with composable FSDP & PEFT methods to cover single/multi-node GPUs. Supports default & custom datasets for applications such as summarization & question answering, and a number of inference solutions, such as HF TGI and vLLM, for local or cloud deployment. Includes demo apps to showcase Llama2 for WhatsApp & Messenger.
CASIA-IVA-Lab/FastSAM
Fast Segment Anything
QwenLM/Qwen-VL
The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.
spcl/graph-of-thoughts
Official Implementation of "Graph of Thoughts: Solving Elaborate Problems with Large Language Models"
hustvl/4DGaussians
[CVPR 2024] 4D Gaussian Splatting for Real-Time Dynamic Scene Rendering
tdurieux/anonymous_github
Anonymous Github is a proxy server to support anonymous browsing of Github repositories for open-science code and data.
XueFuzhao/OpenMoE
A family of open-sourced Mixture-of-Experts (MoE) Large Language Models
QwenLM/Qwen-Audio
The official repo of Qwen-Audio (通义千问-Audio) chat & pretrained large audio language model proposed by Alibaba Cloud.
bytedance/SALMONN
SALMONN: Speech Audio Language Music Open Neural Network
eric-ai-lab/MiniGPT-5
Official implementation of paper "MiniGPT-5: Interleaved Vision-and-Language Generation via Generative Vokens"
OpenDriveLab/DriveLM
DriveLM: Driving with Graph Visual Question Answering
facebookresearch/fairseq2
FAIR Sequence Modeling Toolkit 2
google-deepmind/open_x_embodiment
QwenLM/qwen.cpp
C++ implementation of Qwen-LM
AILab-CVC/SEED
Official implementation of SEED-LLaMA (ICLR 2024).
OpenGVLab/all-seeing
[ICLR 2024] This is the official implementation of the paper "The All-Seeing Project: Towards Panoptic Visual Recognition and Understanding of the Open World"
AILab-CVC/SEED-Bench
(CVPR 2024) A benchmark for evaluating Multimodal LLMs using multiple-choice questions.
berkeley-hipie/HIPIE
[NeurIPS 2023] Code release for "Hierarchical Open-vocabulary Universal Image Segmentation"
Haian-Jin/TensoIR
[CVPR 2023] TensoIR: Tensorial Inverse Rendering
baaivision/CapsFusion
[CVPR 2024] CapsFusion: Rethinking Image-Text Data at Scale
TencentARC/ViT-Lens
[CVPR 2024] ViT-Lens: Towards Omni-modal Representations
VILA-Lab/SRe2L
(NeurIPS 2023 spotlight) Large-scale dataset distillation/condensation; at 50 IPC (images per class), achieves the highest accuracy of 60.8% on the original ImageNet-1K validation set.
OFA-Sys/TouchStone
TouchStone: Evaluating Vision-Language Models by Language Models
Jiayuan-Gu/hab-mobile-manipulation
Mobile manipulation in Habitat
Letian2003/C-VQA
Counterfactual Reasoning VQA Dataset
haosulab/ManiSkill2
This repo has moved to https://github.com/haosulab/ManiSkill