pzl1744's Stars
2dust/v2rayN
A GUI client for Windows, Linux and macOS that supports Xray core, sing-box core, and others
hiroi-sora/Umi-OCR
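Open-source, free, offline OCR software. Supports screenshot capture and batch image import, PDF document recognition, exclusion of watermarks and headers/footers, and scanning/generating QR codes. Includes built-in libraries for multiple languages.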
jaywalnut310/vits
VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech
MoonInTheRiver/DiffSinger
DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022; Official code
williamyang1991/VToonify
[SIGGRAPH Asia 2022] VToonify: Controllable High-Resolution Portrait Video Style Transfer
yoyo-nb/Thin-Plate-Spline-Motion-Model
[CVPR 2022] Thin-Plate Spline Motion Model for Image Animation.
project-baize/baize-chatbot
Let ChatGPT teach your own chatbot in hours with a single GPU!
openvpi/DiffSinger
An advanced singing voice synthesis system with high fidelity, expressiveness, controllability and flexibility based on DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism
salu133445/musegan
An AI for Music Generation
williamyang1991/DualStyleGAN
[CVPR 2022] Pastiche Master: Exemplar-Based High-Resolution Portrait Style Transfer
magenta/mt3
MT3: Multi-Task Multitrack Music Transcription
apple/ml-neuman
Official repository of NeuMan: Neural Human Radiance Field from a Single Video (ECCV 2022)
NVIDIA/mellotron
Mellotron: a multispeaker voice synthesis model based on Tacotron 2 GST that can make a voice emote and sing without emotive or singing training data
marcoppasini/musika
Fast Infinite Waveform Music Generation
Jittor/JNeRF
JNeRF is a NeRF benchmark based on Jittor. JNeRF re-implements instant-ngp and achieves the same performance as the original paper.
leimao/Voice-Converter-CycleGAN
Voice Converter Using CycleGAN and Non-Parallel Data
SforAiDl/Neural-Voice-Cloning-With-Few-Samples
This repository contains an implementation of "Neural Voice Cloning With Few Samples"
KinglittleQ/GST-Tacotron
A PyTorch implementation of Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis
deterministic-algorithms-lab/Cross-Lingual-Voice-Cloning
Tacotron 2 - PyTorch implementation with faster-than-realtime inference modified to enable cross lingual voice cloning.
keonlee9420/DiffGAN-TTS
PyTorch Implementation of DiffGAN-TTS: High-Fidelity and Efficient Text-to-Speech with Denoising Diffusion GANs
Sharad24/Neural-Voice-Cloning-with-Few-Samples
Implementation of Neural Voice Cloning with Few Samples Research Paper by Baidu
AIFSH/NativeSpeaker
Make your speaker talk in a native style with their own voice!
CMsmartvoice/One-Shot-Voice-Cloning
One-Shot Voice Cloning based on Unet-TTS
Edresson/VoiceSplit
VoiceSplit: Targeted Voice Separation by Speaker-Conditioned Spectrogram
Seanseattle/StyleSwap
StyleSwap: Style-Based Generator Empowers Robust Face Swapping (ECCV 2022)
PlayVoice/VI-SVS
Singing Voice Synthesis based on VITS, different from VISinger
SMART-TTS/SMART-Single_Emotional_TTS
foamliu/Tacotron2-Mandarin
PyTorch reimplementation of Tacotron2 in Mandarin
MingtaoGuo/StyleSwap
Unofficial implementation of the paper: StyleSwap: Style-Based Generator Empowers Robust Face Swapping
zawawiAI/yolo_gpt
This is a GUI application that integrates YOLOv8 object recognition with OpenAI's GPT-3 language generation model.