tfka's Stars
DirtyHarryLYL/LLM-in-Vision
Recent LLM-based CV and related works. Welcome to comment/contribute!
THUDM/WebGLM
WebGLM: An Efficient Web-enhanced Question Answering System (KDD 2023)
google/break-a-scene
Official implementation for "Break-A-Scene: Extracting Multiple Concepts from a Single Image" [SIGGRAPH Asia 2023]
Alibaba-NLP/EcomGPT
An Instruction-tuned Large Language Model for E-commerce
neulab/prompt2model
prompt2model - Generate Deployable Models from Natural Language Instructions
QwenLM/Qwen-VL
The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.
DCDmllm/Cheetah
chatchat-space/Langchain-Chatchat
Langchain-Chatchat (formerly Langchain-ChatGLM): a local-knowledge-based RAG and Agent app built with Langchain on LLMs such as ChatGLM, Qwen, and Llama
yanqiangmiffy/Chinese-LangChain
Chinese LangChain project | 小必应, Q.Talk, 强聊, QiangTalk
langchain-ai/langchain
🦜🔗 Build context-aware reasoning applications
jingyi0000/VLM_survey
Collection of AWESOME vision-language models for vision tasks
salesforce/DialogStudio
DialogStudio: Towards Richest and Most Diverse Unified Dataset Collection and Instruction-Aware Models for Conversational AI
AILab-CVC/SEED
Official implementation of SEED-LLaMA (ICLR 2024).
OpenGVLab/Ask-Anything
[CVPR2024 Highlight][VideoChatGPT] ChatGPT with video understanding! And many more supported LMs such as miniGPT4, StableLM, and MOSS.
OSU-NLP-Group/Mind2Web
[NeurIPS'23 Spotlight] "Mind2Web: Towards a Generalist Agent for the Web"
HazyResearch/TART
TART: A plug-and-play Transformer module for task-agnostic reasoning
Victorwz/LongMem
Official implementation of our NeurIPS 2023 paper "Augmenting Language Models with Long-Term Memory".
SHI-Labs/Matting-Anything
Matting Anything Model (MAM), an efficient and versatile framework for estimating the alpha matte of any instance in an image with flexible and interactive visual or linguistic user prompt guidance.
xinyu1205/recognize-anything
Open-source and strong foundation image recognition models.
bojone/NBCE
Naive Bayes-based Context Extension
dome272/Wuerstchen
Official implementation of Würstchen: Efficient Pretraining of Text-to-Image Models
wade3han/champagne
An official codebase for the paper "CHAMPAGNE: Learning Real-world Conversation from Large-Scale Web Videos" (ICCV 23)
HenryHZY/Awesome-Multimodal-LLM
Research Trends in LLM-guided Multimodal Learning.
OFA-Sys/ExpertLLaMA
An open-source chatbot built with ExpertPrompting which achieves 96% of ChatGPT's capability.
cg1177/VideoLLM
VideoLLM: Modeling Video Sequence with Large Language Models
XingangPan/DragGAN
Official Code for DragGAN (SIGGRAPH 2023)
aiwaves-cn/RecurrentGPT
Official Code for Paper: RecurrentGPT: Interactive Generation of (Arbitrarily) Long Text
YBYBZhang/ControlVideo
[ICLR 2024] Official pytorch implementation of "ControlVideo: Training-free Controllable Text-to-Video Generation"
mit-han-lab/fastcomposer
[IJCV] FastComposer: Tuning-Free Multi-Subject Image Generation with Localized Attention
silverriver/MMChat
[LREC] MMChat: Multi-Modal Chat Dataset on Social Media