silyfox

Pinned Repositories

AdaIN-style
Arbitrary Style Transfer in Real-time with Adaptive Instance Normalization
Language:Lua00
Cross-Speaker-Emotion-Transfer
PyTorch Implementation of ByteDance's Cross-speaker Emotion Transfer Based on Speaker Condition Layer Normalization and Semi-Supervised Training in Text-To-Speech
Language:Python10
DiffusionVAE
Language:Python00
dive_into_deep_learning
✔️ Study notes for Mu Li's "Dive into Deep Learning" (动手学深度学习) course: written in PyCharm and implemented with the PyTorch framework.
Language:Python20
dl-for-emo-tts
💻 🤖 A summary of our attempts at using Deep Learning approaches for Emotional Text to Speech 🔊
Language:Jupyter Notebook21
fairseq
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
Language:Python10
naturalspeech2-pytorch
Implementation of Natural Speech 2, Zero-shot Speech and Singing Synthesizer, in Pytorch
Language:Python10
SDEdit
PyTorch implementation for SDEdit: Image Synthesis and Editing with Stochastic Differential Equations
Language:Python10
unilm
UniLM AI - Large-scale Self-supervised Pre-training across Tasks, Languages, and Modalities
Language:Python10
vae
A simple VAE and CVAE from Keras
Language:Python10

silyfox's Repositories

silyfox/naturalspeech2-pytorch
Implementation of Natural Speech 2, Zero-shot Speech and Singing Synthesizer, in Pytorch
Language:Python10
silyfox/annotated_deep_learning_paper_implementations
🧑‍🏫 60 Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), gans (cyclegan, stylegan2, ...), 🎮 reinforcement learning (ppo, dqn), capsnet, distillation, ... 🧠
silyfox/ChatTTS
A generative speech model for daily dialogue.
silyfox/ControlSpeech
ControlSpeech: Towards Simultaneous Zero-shot Speaker Cloning and Zero-shot Language Style Control With Decoupled Codec
silyfox/CVQ-VAE
[ICCV 2023] Online Clustered Codebook
silyfox/DiffVar
silyfox/e2-tts-pytorch
Implementation of E2-TTS, "Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS", in Pytorch
silyfox/EmoSphere-TTS
The official implementation of EmoSphere-TTS
silyfox/EmotiVoice
EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine
silyfox/FAcodec
Training code for FAcodec presented in NaturalSpeech3
silyfox/fry_course_materials
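Materials for Fan Renyi's recorded courses. A range of completely free courses on front-end, back-end, big data, artificial intelligence, and more will be released in turn. Course website: https://fanrenyi.com ; Bilibili course page: https://space.bilibili.com/45664489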
silyfox/FunCodec
FunCodec is a research-oriented toolkit for audio quantization and downstream applications, such as text-to-speech synthesis, music generation, etc.
silyfox/GPT-SoVITS
One minute of voice data can be enough to train a good TTS model! (few-shot voice cloning)
silyfox/improved-diffusion
Release for Improved Denoising Diffusion Probabilistic Models
silyfox/kmeans_pytorch
kmeans using PyTorch
silyfox/MassTTS
a TTS demo for training new characters.
silyfox/OpenVoice
Instant voice cloning by MyShell.
silyfox/parler-tts
Inference and training library for high-quality TTS models.
silyfox/py-webrtcvad
Python interface to the WebRTC Voice Activity Detector
silyfox/Qwen2-Audio
The official repo of Qwen2-Audio chat & pretrained large audio language model proposed by Alibaba Cloud.
silyfox/representation-space-info-comparison
Code accompanying "Comparing information content of representation spaces for disentanglement with VAE ensembles"
silyfox/SECap
silyfox/SpeechGen
"SpeechGen: Unlocking the Generative Power of Speech Language Models with Prompts"
silyfox/StableTTS
Next-generation TTS model using flow-matching and DiT, inspired by Stable Diffusion 3
silyfox/Style-Bert-VITS2
Style-Bert-VITS2: Bert-VITS2 with more controllable voice styles.
silyfox/tortoise-tts
A multi-voice TTS system trained with an emphasis on quality
silyfox/ustyle
Language:HTML
silyfox/X-E-Speech-code
X-E-Speech: Joint Training Framework of Non-Autoregressive Cross-lingual Emotional Text-to-Speech and Voice Conversion
silyfox/XJTU-thesis
An official LaTeX template for Xi'an Jiaotong University degree theses (for master's and doctoral degrees; Chinese and English)
silyfox/XTTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production