YongLD's Stars
OpenInterpreter/open-interpreter
A natural language interface for computers
hpcaitech/Open-Sora
Open-Sora: Democratizing Efficient Video Production for All
deepset-ai/haystack
AI orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data. With advanced retrieval methods, it's best suited for building RAG, question answering, semantic search or conversational agent chatbots.
e2b-dev/awesome-ai-agents
A list of AI autonomous agents
salesforce/LAVIS
LAVIS - A One-stop Library for Language-Vision Intelligence
fudan-generative-vision/champ
Champ: Controllable and Consistent Human Image Animation with 3D Parametric Guidance
open-compass/opencompass
OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets.
yisol/IDM-VTON
[ECCV2024] IDM-VTON : Improving Diffusion Models for Authentic Virtual Try-on in the Wild
OpenBMB/BMTools
Tool Learning for Big Models, Open-Source Solutions of ChatGPT-Plugins
PhoebusSi/Alpaca-CoT
We unified the interfaces of instruction-tuning data (e.g., CoT data), multiple LLMs, and parameter-efficient methods (e.g., LoRA, P-tuning) for easy use. We built this fine-tuning platform to make it easy for researchers to get started with and use large models. We welcome open-source enthusiasts to initiate any meaningful PR on this repo and integrate as many LLM-related technologies as possible.
THUDM/AgentBench
A Comprehensive Benchmark to Evaluate LLMs as Agents (ICLR'24)
baaivision/Emu
Emu Series: Generative Multimodal Models from BAAI
open-compass/VLMEvalKit
Open-source evaluation toolkit for large vision-language models (LVLMs), supporting 160+ VLMs and 50+ benchmarks
PKU-YuanGroup/MagicTime
MagicTime: Time-lapse Video Generation Models as Metamorphic Simulators
0nutation/SpeechGPT
SpeechGPT Series: Speech Large Language Models
BAAI-DCAI/Bunny
A family of lightweight multimodal models.
mli/transformers-benchmarks
real Transformer TeraFLOPS on various GPUs
datadreamer-dev/DataDreamer
DataDreamer: Prompt. Generate Synthetic Data. Train & Align Models. 🤖💤
LTH14/rcg
PyTorch implementation of RCG https://arxiv.org/abs/2312.03701
dvlab-research/LLaMA-VID
LLaMA-VID: An Image is Worth 2 Tokens in Large Language Models (ECCV 2024)
Event-AHU/Mamba_State_Space_Model_Paper_List
[Mamba-Survey-2024] Paper list for State-Space-Model/Mamba and its Applications
EmulationAI/awesome-large-audio-models
Collection of resources on the applications of Large Language Models (LLMs) in Audio AI.
Genesis-Embodied-AI/RoboGen
A generative and self-guided robotic agent that endlessly proposes and masters new skills.
LeapLabTHU/Agent-Attention
Official repository of Agent Attention (ECCV2024)
showlab/DragAnything
[ECCV 2024] DragAnything: Motion Control for Anything using Entity Representation
robopen/roboagent
Repository to train and evaluate RoboAgent
Marker-Inc-Korea/RAGchain
Extension of Langchain for RAG. Easy benchmarking, multiple retrievals, reranker, time-aware RAG, and so on...
yudianzheng/SketchVideo
[EG 2023] Sketch Video Synthesis
codefuse-ai/CodeFuse-MFT-VLM
thecharm/BDoG
Code for ACM MM 2024 paper "A Picture Is Worth a Graph: A Blueprint Debate Paradigm for Multimodal Reasoning"