Pinned Repositories
acdemic
Amphion
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
android_nfc_book
A repo for code samples in my android-nfc-book
assimp
Official Open Asset Import Library Repository. Loads 40+ 3D file formats into one unified and clean data structure.
awesome-deep-learning-music
List of articles related to deep learning applied to music
interviewforprogrammers
maoyan
shiyanba
stable-diffusion-webui
Stable Diffusion web UI
tacotronv2_wavernn_chinese
Chinese speech synthesis with Tacotron V2 + WaveRNN (TensorFlow + PyTorch)
daxiangpanda's Repositories
daxiangpanda/stable-diffusion-webui
Stable Diffusion web UI
daxiangpanda/Amphion
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
daxiangpanda/audiocraft_plus
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
daxiangpanda/bark-training-cloning
for training the model
daxiangpanda/carefree-creator
An AI-powered creator for everyone.
daxiangpanda/CLAP
Contrastive Language-Audio Pretraining
daxiangpanda/DiffSinger
PyTorch Implementation of DiffSinger: Diffusion Acoustic Model for Singing Voice Synthesis (TTS Extension)
daxiangpanda/DiffSinger-1
DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022; Forked and maintained by the OpenVPI community
daxiangpanda/disable-flutter-tls-verification
A Frida script that disables Flutter's TLS verification
daxiangpanda/dream-textures
Stable Diffusion built-in to the Blender shader editor
daxiangpanda/facechain
FaceChain is a deep-learning toolchain for generating your Digital-Twin.
daxiangpanda/FluxMusic
Text-to-Music Generation with Rectified Flow Transformers
daxiangpanda/Games
Home Page Link:
daxiangpanda/lobe-chat
🤖 Lobe Chat - an open-source, high-performance chatbot framework that supports speech synthesis, multimodal, and extensible Function Call plugin system. Supports one-click free deployment of your private ChatGPT/LLM web application.
daxiangpanda/MDM
MDM
daxiangpanda/metahuman-stream
Real time interactive streaming digital human
daxiangpanda/midi-js-soundfonts
Pre-rendered General MIDI soundfonts that can be used immediately with MIDI.js
daxiangpanda/muzic
Muzic: Music Understanding and Generation with Artificial Intelligence
daxiangpanda/OpenVoice
Instant voice cloning by MyShell.
daxiangpanda/PaddleSpeech
Easy-to-use Speech Toolkit including SOTA/Streaming ASR with punctuation, influential TTS with text frontend, Speaker Verification System and End-to-End Speech Simultaneous Translation.
daxiangpanda/ParallelWaveGAN
Unofficial Parallel WaveGAN (+ MelGAN & Multi-band MelGAN & HiFi-GAN & StyleMelGAN) with PyTorch
daxiangpanda/ppg-vc
PPG-Based Voice Conversion
daxiangpanda/python_template
daxiangpanda/roop
one-click deepfake (face swap)
daxiangpanda/segment-anything
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
daxiangpanda/so-vits-svc
SoftVC VITS Singing Voice Conversion
daxiangpanda/UniAudio
The Open Source Code of UniAudio
daxiangpanda/unilm
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
daxiangpanda/vits
VITS implementation of Japanese, Chinese, Korean, Sanskrit and Thai
daxiangpanda/wenet
Production First and Production Ready End-to-End Speech Recognition Toolkit