xhyandwyy

Multimodal mPLUG.

Alibaba DAMO Academy, Hangzhou, China

xhyandwyy's Stars

All-Hands-AI/OpenHands
🙌 OpenHands: Code Less, Make More
Language: Python 39.2k 321 1.9k 4.4k
hiyouga/LLaMA-Factory
Unified Efficient Fine-Tuning of 100+ LLMs (ACL 2024)
Language: Python 36.8k 219 5.6k 4.5k
microsoft/autogen
A programming framework for agentic AI 🤖 PyPi: autogen-agentchat Discord: https://aka.ms/autogen-discord Office Hour: https://aka.ms/autogen-officehour
Language: Python 36.4k 417 2.2k 5.3k
meta-llama/llama3
The official Meta Llama 3 GitHub site
Language: Python 27.6k 230 2733.2k
stanford-oval/storm
An LLM-powered knowledge curation system that researches a topic and generates a full-length report with citations.
Language: Python 13.8k 100 1521.3k
e2b-dev/awesome-ai-agents
A list of AI autonomous agents
12.4k 224 34922
axolotl-ai-cloud/axolotl
Go ahead and axolotl questions
Language: Python 8.2k 45 710898
apple/corenet
CoreNet: A library for training deep neural networks
Language: Jupyter Notebook 7k 66 21543
Ucas-HaoranWei/GOT-OCR2.0
Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model
Language: Python 6.4k 55 189561
pywinauto/pywinauto
Windows GUI Automation with Python (based on text properties)
Language: Python 5.1k 167 971701
InternLM/lmdeploy
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
Language: Python 4.9k 40 1.6k 445
modelscope/ms-swift
Use PEFT or Full-parameter to finetune 400+ LLMs (Qwen2.5, Llama3.2, GLM4, Internlm2.5, Yi1.5, Mistral, Baichuan2, DeepSeek, ...) or 100+ MLLMs (Qwen2-VL, Qwen2-Audio, Llama3.2-Vision, Llava, InternVL2.5, MiniCPM-V-2.6, GLM4v, Xcomposer2.5, Yi-VL, DeepSeek-VL2, Phi3.5-Vision, GOT-OCR2, ...).
Language: Python 4.8k 23 1.5k 418
AILab-CVC/VideoCrafter
VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models
Language: Python 4.6k 71 84347
google-research/big_vision
Official codebase used to develop Vision Transformer, SigLIP, MLP-Mixer, LiT and more.
Language: Jupyter Notebook 2.5k 39 61163
facebookresearch/chameleon
Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR.
Language: Python 1.9k 26 51113
ytongbai/LVM
Language: Python 1.8k 117 2455
Vchitect/Latte
Latte: Latent Diffusion Transformer for Video Generation.
Language: Python 1.7k 23 106180
aigc-apps/EasyAnimate
📺 An End-to-End Solution for High-Resolution and Long Video Generation Based on Transformer Diffusion
Language: Python 1.6k 20 120118
landing-ai/vision-agent
Vision agent
Language: Python 1.6k 24 16189
AlibabaResearch/AdvancedLiterateMachinery
A collection of original, innovative ideas and algorithms towards Advanced Literate Machinery. This project is maintained by the OCR Team in the Language Technology Lab, Tongyi Lab, Alibaba Group.
Language: C++ 1.6k 39 188182
PKU-YuanGroup/MagicTime
MagicTime: Time-lapse Video Generation Models as Metamorphic Simulators
Language: Python 1.3k 21 29125
mini-sora/minisora
MiniSora: A community aims to explore the implementation path and future development direction of Sora.
Language: Python 1.2k 19 64151
Vchitect/SEINE
[ICLR 2024] SEINE: Short-to-Long Video Diffusion Model for Generative Transition and Prediction
Language: Python 922 25 3066
kyegomez/ScreenAI
Implementation of the ScreenAI model from the paper: "A Vision-Language Model for UI and Infographics Understanding"
Language: Python 307 9 630
TIGER-AI-Lab/Mantis
Official code for Paper "Mantis: Multi-Image Instruction Tuning" (TMLR2024)
Language: Python 191 9 2115
AILab-CVC/Make-Your-Video
[IEEE TVCG 2024] Customized Video Generation Using Textual and Structural Guidance
Language: Python 186 16 28
zjunlp/KnowAgent
KnowAgent: Knowledge-Augmented Planning for LLM-Based Agents
Language: Python 183 6 716
OpenGVLab/GUI-Odyssey
GUI Odyssey is a comprehensive dataset for training and evaluating cross-app navigation agents. GUI Odyssey consists of 7,735 episodes from 6 mobile devices, spanning 6 types of cross-app tasks, 201 apps, and 1.4K app combos.
Language: Python 76 3 94
bytarnish/AGILE
Language: Python 57 1 0 3
X-PLUG/MM_StoryAgent
Language: Python 7 0 0
