wizardhunter's Stars
Significant-Gravitas/AutoGPT
AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.
TencentARC/GFPGAN
GFPGAN aims at developing Practical Algorithms for Real-world Face Restoration.
OpenBMB/ChatDev
Create Customized Software using Natural Language Idea (through LLM-powered Multi-Agent Collaboration)
microsoft/JARVIS
JARVIS, a system to connect LLMs with ML community. Paper: https://arxiv.org/pdf/2303.17580.pdf
HqWu-HITCS/Awesome-Chinese-LLM
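A curated collection of open-source Chinese large language models, focusing on smaller-scale models that can be privately deployed and are cheaper to train, including base models, domain-specific fine-tuning and applications, datasets, and tutorials.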
princeton-nlp/SWE-agent
SWE-agent takes a GitHub issue and tries to automatically fix it, using GPT-4, or your LM of choice. It can also be employed for offensive cybersecurity or competitive coding challenges. [NeurIPS 2024]
PKU-YuanGroup/Open-Sora-Plan
This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to this project.
facebookresearch/AnimatedDrawings
Code to accompany "A Method for Animating Children's Drawings of the Human Figure"
modelscope/facechain
FaceChain is a deep-learning toolchain for generating your Digital-Twin.
OpenBMB/XAgent
An Autonomous LLM Agent for Complex Task Solving
THUDM/GLM-130B
GLM-130B: An Open Bilingual Pre-Trained Model (ICLR 2023)
Plachtaa/VALL-E-X
An open source implementation of Microsoft's VALL-E X zero-shot TTS model. A demo is available at https://plachtaa.github.io/vallex/
modelscope/FunASR
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
FunAudioLLM/CosyVoice
Multi-lingual large voice generation model, providing full-stack inference, training, and deployment capabilities.
princeton-vl/infinigen
Infinite Photorealistic Worlds using Procedural Generation
OpenBMB/ToolBench
[ICLR'24 spotlight] An open platform for training, serving, and evaluating large language models for tool learning.
myshell-ai/MeloTTS
High-quality multi-lingual text-to-speech library by MyShell.ai. Supports English, Spanish, French, Chinese, Japanese and Korean.
Zejun-Yang/AniPortrait
AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animation
timesler/facenet-pytorch
Pretrained PyTorch face detection (MTCNN) and facial recognition (InceptionResnet) models
NVlabs/neuralangelo
Official implementation of "Neuralangelo: High-Fidelity Neural Surface Reconstruction" (CVPR 2023)
PetoiCamp/OpenCat
An open source quadruped robot pet framework for developing Boston Dynamics-style four-legged robots that are perfect for STEM, coding & robotics education, IoT robotics applications, AI-enhanced robotics application services, research, and DIY robotics kit development.
facebookresearch/encodec
State-of-the-art deep learning based audio codec supporting both mono 24 kHz audio and stereo 48 kHz audio.
facebookresearch/audio2photoreal
Code and dataset for photorealistic Codec Avatars driven from audio
ZiqiaoPeng/SyncTalk
[CVPR 2024] This is the official source for our paper "SyncTalk: The Devil is in the Synchronization for Talking Head Synthesis"
aipixel/GPS-Gaussian
[CVPR 2024 Highlight] The official repo for “GPS-Gaussian: Generalizable Pixel-wise 3D Gaussian Splatting for Real-time Human Novel View Synthesis”
synthesiaresearch/humanrf
Official code for "HumanRF: High-Fidelity Neural Radiance Fields for Humans in Motion"
Stanford-TML/EDGE
Official PyTorch Implementation of EDGE (CVPR 2023)
DiffPoseTalk/DiffPoseTalk
DiffPoseTalk: Speech-Driven Stylistic 3D Facial Animation and Head Pose Generation via Diffusion Models
tobias-kirschstein/nersemble
[Siggraph '23] NeRSemble: Neural Radiance Field Reconstruction of Human Heads
heyuanYao-pku/MoConVQ