ytzhangscr's Stars
xinsir6/ControlNetPlus
ControlNet++: All-in-one ControlNet for image generation and editing!
AIVFI/Monocular-Depth-Estimation-Rankings-and-2D-to-3D-Video-Conversion-Rankings
Rankings include: BetterDepth, Depth Anything, DPT, FutureDepth, GBDMF, GenPercept, GeoWizard, LeReS, LightedDepth, LFVRT, Marigold, Metric3D, MiDaS, NeWCRFs, PatchFusion, UniDepth, ZoeDepth
datawhalechina/intro-mathmodel
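"Introduction to Mathematical Modeling" tutorial: the most comprehensive series of tutorials on mathematical modeling models and algorithms on the web, leading you through the door into mathematical modeling!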
YangLing0818/VideoTetris
[NeurIPS 2024] VideoTetris: Towards Compositional Text-To-Video Generation
X-LANCE/AniTalker
[ACM MM 2024] This is the official code for "AniTalker: Animate Vivid and Diverse Talking Faces through Identity-Decoupled Facial Motion Encoding"
modelscope/FunClip
Open-source, accurate and easy-to-use video speech recognition & clipping tool, with LLM-based AI clipping integrated.
replicate/cog
Containers for machine learning
Lxiangyue/GenN2N
[CVPR'24 - Rebuttal Score 554] GenN2N: Generative NeRF2NeRF Translation
mshumer/gpt-prompt-engineer
myshell-ai/MeloTTS
High-quality multi-lingual text-to-speech library by MyShell.ai. Supports English, Spanish, French, Chinese, Japanese and Korean.
aishwaryanr/awesome-generative-ai-guide
A one-stop repository for generative AI research updates, interview resources, notebooks and much more!
HumanAIGC/AnimateAnyone
Animate Anyone: Consistent and Controllable Image-to-Video Synthesis for Character Animation
jasonppy/VoiceCraft
Zero-Shot Speech Editing and Text-to-Speech in the Wild
All-Hands-AI/OpenHands
🙌 OpenHands: Code Less, Make More
PKU-YuanGroup/Open-Sora-Plan
This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to this project.
marcelscruz/public-apis
A collaborative list of public APIs for developers
HeyPuter/puter
🌐 The Internet OS! Free, Open-Source, and Self-Hostable.
ollama/ollama
Get up and running with Llama 3.2, Mistral, Gemma 2, and other large language models.
Hillobar/Rope
GUI-focused roop
microsoft/UFO
A UI-Focused Agent for Windows OS Interaction.
RVC-Boss/GPT-SoVITS
One minute of voice data is enough to train a good TTS model! (few-shot voice cloning)
mckaywrigley/chatbot-ui
Come join the best place on the internet to learn AI skills. Use code "chatbotui" for an extra 20% off.
myshell-ai/OpenVoice
Instant voice cloning by MIT and MyShell.
dvmazur/mixtral-offloading
Run Mixtral-8x7B models in Colab or on consumer desktops
fishaudio/fish-speech
Brand new TTS solution
ansible/ansible
Ansible is a radically simple IT automation platform that makes your applications and systems easier to deploy and maintain. Automate everything from code deployment to network configuration to cloud management, in a language that approaches plain English, using SSH, with no agents to install on remote systems. https://docs.ansible.com.
Ucas-HaoranWei/Vary
[ECCV 2024] Official code implementation of Vary: Scaling Up the Vision Vocabulary of Large Vision Language Models.
AILab-CVC/UniRepLKNet
[CVPR'24] UniRepLKNet: A Universal Perception Large-Kernel ConvNet for Audio, Video, Point Cloud, Time-Series and Image Recognition
open-mmlab/Amphion
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
geekan/MetaGPT
🌟 The Multi-Agent Framework: First AI Software Company, Towards Natural Language Programming