arnode's Stars
all-in-aigc/melodisco
AI Music Player
microsoft/muzic
Muzic: Music Understanding and Generation with Artificial Intelligence
stakira/OpenUtau
Open singing synthesis platform / Open source UTAU successor
Anjok07/ultimatevocalremovergui
GUI for a Vocal Remover that uses Deep Neural Networks.
feizc/FluxMusic
Text-to-Music Generation with Rectified Flow Transformers
run-llama/llama_index
LlamaIndex is the leading framework for building LLM-powered agents over your data.
VikParuchuri/marker
Convert PDF to markdown + JSON quickly with high accuracy
QwenLM/Qwen-Agent
Agent framework and applications built upon Qwen>=2.0, featuring Function Calling, Code Interpreter, RAG, and Chrome extension.
2noise/ChatTTS
A generative speech model for daily dialogue.
MetaGLM/glm-cookbook
Examples and guides for using the GLM APIs
huggingface/lerobot
🤗 LeRobot: Making AI for Robotics more accessible with end-to-end learning
explosion/sense2vec
🦆 Contextually-keyed word vectors
explosion/spaCy
💫 Industrial-strength Natural Language Processing (NLP) in Python
1Panel-dev/MaxKB
💬 Ready-to-use & flexible RAG Chatbot, supporting mainstream large language models (LLMs) such as DeepSeek-R1, Llama 3.3, Qwen2, OpenAI and more.
fudan-generative-vision/champ
Champ: Controllable and Consistent Human Image Animation with 3D Parametric Guidance
instantX-research/InstantID
InstantID: Zero-shot Identity-Preserving Generation in Seconds 🔥
MoonInTheRiver/DiffSinger
DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022; Official code
Fictionarry/ER-NeRF
[ICCV'23] Efficient Region-Aware Neural Radiance Fields for High-Fidelity Talking Portrait Synthesis
RVC-Boss/GPT-SoVITS
Even 1 minute of voice data can be used to train a good TTS model! (few-shot voice cloning)
coqui-ai/TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
Kedreamix/Linly-Talker
Digital Avatar Conversational System - Linly-Talker. 😄✨ Linly-Talker is an intelligent AI system that combines large language models (LLMs) with visual models to create a novel human-AI interaction method. 🤝🤖 It integrates various technologies like Whisper, Linly, Microsoft Speech Services, and SadTalker talking head generation system. 🌟🔬
lipku/LiveTalking
Real time interactive streaming digital human
OpenTalker/SadTalker
[CVPR 2023] SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation
Human3DAIGC/Make-A-Character
Official repo for Make-A-Character: High Quality Text-to-3D Character Generation within Minutes
xai-org/grok-1
Grok open release
yuqinie98/PatchTST
An official implementation of PatchTST: "A Time Series is Worth 64 Words: Long-term Forecasting with Transformers." (ICLR 2023) https://arxiv.org/abs/2211.14730
opendilab/LightZero
[NeurIPS 2023 Spotlight] LightZero: A Unified Benchmark for Monte Carlo Tree Search in General Sequential Decision Scenarios (awesome MCTS)
zhayujie/chatgpt-on-wechat
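A chatbot built on large language models that supports integration with WeChat Official Accounts, WeCom (Enterprise WeChat) apps, Feishu, DingTalk, and more. Selectable models include GPT3.5/GPT-4o/GPT-o1/DeepSeek/Claude/ERNIE Bot/iFlytek Spark/Tongyi Qianwen/Gemini/GLM-4/Kimi/LinkAI. It can process text, voice, and images, access the operating system and the internet, and supports building customized enterprise customer-service assistants on top of your own knowledge base.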
DaiShiResearch/TransNeXt
[CVPR 2024] Code release for TransNeXt model
OrionStarAI/Orion
Orion-14B is a family of models that includes a 14B-parameter multilingual foundation LLM and a series of derived models: a chat model, a long-context model, a quantized model, a RAG fine-tuned model, and an Agent fine-tuned model.