robotnc's Stars
Stability-AI/stablediffusion
High-Resolution Image Synthesis with Latent Diffusion Models
coqui-ai/TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
suno-ai/bark
🔊 Text-Prompted Generative Audio Model
DayBreak-u/chineseocr_lite
Ultra-lightweight Chinese OCR with support for vertical text recognition and for ncnn, mnn, and tnn inference (dbnet (1.8M) + crnn (2.5M) + anglenet (378KB)); total model size is only 4.7M
jaywalnut310/vits
VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech
MoonInTheRiver/DiffSinger
DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022; Official code
facebookresearch/encodec
State-of-the-art deep learning based audio codec supporting both mono 24 kHz audio and stereo 48 kHz audio.
enhuiz/vall-e
An unofficial PyTorch implementation of the audio LM VALL-E
lucidrains/audiolm-pytorch
Implementation of AudioLM, a SOTA Language Modeling Approach to Audio Generation out of Google Research, in Pytorch
TigerResearch/TigerBot
TigerBot: A multi-language multi-task LLM
lifeiteng/vall-e
PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech). Reproduced demo: https://lifeiteng.github.io/valle/index.html
qiuqiangkong/audioset_tagging_cnn
Edresson/YourTTS
YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone
dulaiduwang003/TIME-SEA-chatgpt
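An AI platform built on SpringBoot3 with two clients (web and mini program). Includes various AI models and image generation, payment, data synchronization between the two clients, custom preset prompts, and configurable feature modules; the web client is mobile-compatible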
keonlee9420/DiffGAN-TTS
PyTorch Implementation of DiffGAN-TTS: High-Fidelity and Efficient Text-to-Speech with Denoising Diffusion GANs
KevinMIN95/StyleSpeech
Official implementation of Meta-StyleSpeech and StyleSpeech
openasic-org/xk265
xk265: HEVC/H.265 Video Encoder IP Core (RTL)
hyperconnect/TC-ResNet
Code for Temporal Convolution for Real-time Keyword Spotting on Mobile Devices
keonlee9420/StyleSpeech
PyTorch Implementation of Meta-StyleSpeech : Multi-Speaker Adaptive Text-to-Speech Generation
Enny1991/beamformers
Easy to use Beamformers for multi-channel speech separation/enhancement
WelkinYang/GradTTS
Pytorch implementation of "Grad-TTS: A Diffusion Probabilistic Model for Text-to-Speech"
harvard-edge/multilingual_kws
Few-shot Keyword Spotting in Any Language and Multilingual Spoken Word Corpus
wolverinn/HEVC-CU-depths-prediction-CNN
Using convolutional neural networks to predict the Coding Units (CUs) depths in HEVC intra-prediction mode, in order to reduce the time of the encoding process in HEVC.
TeamPyOgg/PyOgg
Simple OGG Vorbis, Opus and FLAC bindings for Python
katsugeneration/tensor-fsmn
Feedforward Sequential Memory Networks (FSMN) implemented in TensorFlow
YiwenShaoStephen/pychain_example
Edresson/Coqui-TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
yinruiqing/fsmn
Feedforward Sequential Memory Networks
xiaoli1996/SSBPR
d5555/FSMN
PyTorch FSMN