YuzaChongyi's Stars
arcee-ai/mergekit
Tools for merging pretrained large language models.
MBZUAI-LLM/web2code
Web2Code: A Large-scale Webpage-to-Code Dataset and Evaluation Framework for Multimodal LLMs
OpenGVLab/OmniCorpus
OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text
HKUST-LongGroup/CoMM
Official repository for CoMM Dataset
mlfoundations/MINT-1T
MINT-1T: A one trillion token multimodal interleaved dataset.
NVIDIA/Megatron-LM
Ongoing research training transformer models at scale
AUTOMATIC1111/stable-diffusion-webui
Stable Diffusion web UI
open-compass/VLMEvalKit
Open-source evaluation toolkit for large vision-language models (LVLMs), supporting 160+ VLMs and 50+ benchmarks
OpenBMB/MiniCPM
MiniCPM3-4B: An edge-side LLM that surpasses GPT-3.5-Turbo.
OpenBMB/MiniCPM-V
MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone
dalinvip/Awesome-ChatGPT
A curated collection of ChatGPT learning resources, continuously updated.
tencent-ailab/IP-Adapter
The image prompt adapter is designed to enable a pretrained text-to-image diffusion model to generate images from an image prompt.
THUDM/CogVLM
A state-of-the-art open visual language model | Multimodal pretrained model
QwenLM/Qwen
The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.
baichuan-inc/Baichuan2
A series of large language models developed by Baichuan Intelligent Technology
mlfoundations/datacomp
DataComp: In search of the next generation of multimodal datasets
youngyangyang04/leetcode-master
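代码随想录 (LeetCode practice guide): a recommended order for working through 200 classic problems, about 600,000 characters of detailed illustrated explanations, video walkthroughs of the difficult points, 50+ mind maps, and solutions in C++, Java, Python, Go, JavaScript, and other languages, so that learning algorithms is no longer confusing! 🔥🔥 Take a look, and you will wish you had found it sooner! 🚀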
ddPn08/Radiata
Stable diffusion webui based on diffusers.
baichuan-inc/Baichuan-13B
A 13B large language model developed by Baichuan Intelligent Technology
OpenBMB/VisCPM
[ICLR'24 spotlight] Chinese and English Multimodal Large Model Series (Chat and Paint) | A series of Chinese-English bilingual multimodal large models based on the CPM foundation models
BradyFU/Awesome-Multimodal-Large-Language-Models
✨✨ Latest Advances on Multimodal Large Language Models
OpenBMB/CPM-Bee
A Chinese-English bilingual foundation model with ten billion parameters
HenryHZY/Awesome-Multimodal-LLM
Research Trends in LLM-guided Multimodal Learning.
haotian-liu/LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
Vision-CAIR/MiniGPT-4
Open-source code for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)
baaivision/Painter
Painter & SegGPT Series: Vision Foundation Models from BAAI
lucidrains/muse-maskgit-pytorch
Implementation of Muse: Text-to-Image Generation via Masked Generative Transformers, in PyTorch
salesforce/LAVIS
LAVIS - A One-stop Library for Language-Vision Intelligence
microsoft/torchscale
Foundation Architecture for (M)LLMs
openai/evals
Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.