ishine

speech asr/speech-recognition tts/text-to-speech vc/voice-conversion

gerzz.inc, Shanghai

Pinned Repositories

AIF-PyTorch
(NOT Official) Implementation of Auto-regressive Integrate-and-Fire (AIF)
Language:Python2 0 00
cmas-sample
A simple sample illustrating a circular microphone array separator based on beamforming
Language:C++30
ContextNet
TensorFlow 2 implementation of ContextNet, an improved convolutional RNN-transducer architecture for end-to-end speech recognition using global context
Language:Python16 2 014
kaldi
This is now the official location of the Kaldi project.
Language:Shell20
knn-vc
Voice conversion with just k-nearest neighbors
Language:Python42
lwnn
Lightweight Neural Network
Language:C35
mlas
Language:Assembly20
Project_sp_ehance_matlab
Language:MATLAB94
VALL-E-X-Trainer
VALL-E-X-Trainer
Language:Jupyter Notebook76
vc-lm
Convert any person's voice timbre into thousands of different timbres
Language:Python18 1 024

ishine's Repositories

ishine/kaldi
This is now the official location of the Kaldi project.
Language:Shell20
ishine/stac-speech-translation
Language:Python1
ishine/auorange
Audio LPC (linear prediction coding) using mel spectrogram, compatible with LPCNet
ishine/bark.cpp
Port of Suno AI's Bark in C/C++ for fast inference
Language:C++
ishine/Bert-VITS2
VITS2 backbone with BERT
Language:Python
ishine/ChatTTS
TTS
ishine/CoMoSpeech
Language:Python
ishine/DiSeg
Source code for ACL 2023 paper "End-to-End Simultaneous Speech Translation with Differentiable Segmentation"
Language:Python0 0
ishine/dynamic-window-speechformer
Language:Python
ishine/espnet
End-to-End Speech Processing Toolkit
Language:Python
ishine/FunASR
A Fundamental End-to-End Speech Recognition Toolkit
Language:Python1
ishine/GenTranslate
Code for paper "GenTranslate: Large Language Models are Generative Multilingual Speech and Machine Translators"
Language:Python
ishine/Lightvoc
LightVoc: An Upsampling-Free GAN Vocoder Based on Conformer and Inverse Short-Time Fourier Transform
Language:Jupyter Notebook
ishine/MeloTTS
High-quality multi-lingual text-to-speech library by MyShell.ai. Supports English, Spanish, French, Chinese, Japanese, and Korean.
Language:Python
ishine/noisereduce
Noise reduction in Python using spectral gating (speech, bioacoustics, audio, time-domain signals)
ishine/Qwen-Audio
The official repo of Qwen-Audio (通义千问-Audio) chat & pretrained large audio language model proposed by Alibaba Cloud.
Language:Python
ishine/rwkv.cpp
INT4 and FP16 inference on CPU for RWKV language model
Language:C++
ishine/SLAM-LLM
Speech, Language, Audio, Music Processing with Large Language Model
Language:Python
ishine/STAR-Adapt
Code for paper "Self-Taught Recognizer: Toward Unsupervised Adaptation for Speech Foundation Models"
ishine/stream-vc
An unofficial PyTorch implementation of StreamVC (Real-Time Low-Latency Voice Conversion)
Language:Python
ishine/StreamingSpeakerDiarization
Official open source implementation of the paper "Overlap-aware low-latency online speaker diarization based on end-to-end local segmentation"
Language:Python
ishine/StreamVC
An unofficial PyTorch implementation of "StreamVC: Real-Time Low-Latency Voice Conversion".
Language:Python
ishine/TeleSpeech-ASR
ishine/tinyvc
Lightweight voice conversion
Language:Python
ishine/tortoise.cpp
Language:C++
ishine/UMOE-Scaling-Unified-Multimodal-LLMs
Code for "Uni-MoE: Scaling Unified Multimodal Models with Mixture of Experts"
Language:Python
ishine/valle
Zero-Shot Text-To-Speech
Language:Python
ishine/vallex-webui
An open source implementation of Microsoft's VALL-E X zero-shot TTS model
Language:Python
ishine/Whispering-LLaMA
EMNLP 23 - Integrating Whisper Encoder to LLaMA Decoder for Generative ASR Error Correction
Language:Jupyter Notebook
ishine/X-E-Speech-code
X-E-Speech: Joint Training Framework of Non-Autoregressive Cross-lingual Emotional Text-to-Speech and Voice Conversion
Language:Python