pingguomaggie's Stars
hiyouga/LLaMA-Factory
Unified Efficient Fine-Tuning of 100+ LLMs (ACL 2024)
openai/swarm
Educational framework exploring ergonomic, lightweight multi-agent orchestration. Managed by the OpenAI Solutions team.
modelscope/agentscope
Start building LLM-empowered multi-agent applications more easily.
comet-ml/opik
Open-source end-to-end LLM Development Platform
OptimalScale/LMFlow
An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All.
BradyFU/Awesome-Multimodal-Large-Language-Models
:sparkles::sparkles: Latest Advances on Multimodal Large Language Models
SmartFlowAI/TheGodOfCookery
opea-project/GenAIExamples
Generative AI Examples is a collection of GenAI examples, such as ChatQnA and Copilot, that illustrate the pipeline capabilities of the Open Platform for Enterprise AI (OPEA) project.
luogen1996/LaVIN
[NeurIPS 2023] Official implementations of "Cheap and Quick: Efficient Vision-Language Instruction Tuning for Large Language Models"
Institute4FutureHealth/CHA
Conversational Health Agents: A Personalized LLM-powered Agent Framework
Atomic-man007/Awesome_Multimodel_LLM
Awesome_Multimodel is a curated GitHub repository that provides a comprehensive collection of resources for Multimodal Large Language Models (MLLMs). It covers datasets, tuning techniques, in-context learning, visual reasoning, foundational models, and more. Stay updated with the latest advancements.
InternLM/InternLM-XComposer
InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output
SCUTlihaoyu/open-chat-video-editor
An open-source tool for automatic short-video generation.
showlab/Awesome-Video-Diffusion
A curated list of recent diffusion models for video generation, editing, restoration, understanding, etc.
InterDigitalInc/CompressAI
A PyTorch library and evaluation platform for end-to-end compression research
HoangTrinh/ROI_Online_Meeting_Codec
The official source code for RCLC: ROI-based joint conventional and learning video compression
showlab/Image2Paragraph
[A toolbox for fun.] Transform Image into Unique Paragraph with ChatGPT, BLIP2, OFA, GRIT, Segment Anything, ControlNet.
ExponentialML/Video-BLIP2-Preprocessor
A simple script that reads a directory of videos, grabs a random frame, and automatically discovers a prompt for it
gorjanradevski/text2atlas
Codebase for "Learning to ground medical text in a 3D human atlas (CoNLL 2020)".
cambridgeltl/visual-med-alpaca
Visual Med-Alpaca is an open-source, multi-modal foundation model designed specifically for the biomedical domain, built on LLaMA-7B.
IDEA-Research/Grounded-Segment-Anything
Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment, and Generate Anything
salesforce/LAVIS
LAVIS - A One-stop Library for Language-Vision Intelligence
OpenGVLab/Ask-Anything
[CVPR2024 Highlight][VideoChatGPT] ChatGPT with video understanding! And many more supported LMs such as miniGPT4, StableLM, and MOSS.
deepaknlp/MedVidQACL
Implementation of the Benchmark Approaches for Medical Instructional Video Classification (MedVidCL) and Medical Video Question Answering (MedVidQA)
baaivision/Painter
Painter & SegGPT Series: Vision Foundation Models from BAAI
mrebol/Gestures-From-Speech
openhuman-ai/awesome-gesture_generation
Awesome Gesture Generation
ShenhanQian/SpeechDrivesTemplates
[ICCV 2021] The official repo for the paper "Speech Drives Templates: Co-Speech Gesture Synthesis with Learned Templates".
alvinliu0/HA2G
[CVPR 2022] Code for "Learning Hierarchical Cross-Modal Association for Co-Speech Gesture Generation"
Advocate99/DiffGesture
[CVPR'2023] Taming Diffusion Models for Audio-Driven Co-Speech Gesture Generation