daihuangyu

Pinned Repositories

amazon-sagemaker-finetune-deploy-whisper-huggingface
This is a demo project showing how to fine-tune and deploy the Whisper model on SageMaker.
Language: Jupyter Notebook 00
android
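Companion source code for the book 《Android Studio开发实战：从零基础到App上线》 (Android Studio Development in Practice: From Zero Basics to App Launch), fully annotated edition
Language: C 00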
asr-server
FastCGI support for Kaldi ASR
Language: C++ 00
autojs-dingtalk
A script that uses autojs to clock in on DingTalk automatically
Language: JavaScript 00
Beamforming-for-speech-enhancement
Simple delay-and-sum, MVDR, and CGMM-MVDR beamformers
Language: Python 00
CSOL_TPAMI2021
Language: Python 00
glide-text2im
GLIDE: a diffusion-based text-conditional image synthesis model
Language: Python 10
NKF_train
NKF training
Language: Python 12
speex_aec_kf
speex aec kalman filter
Language: Python 5 1 03
webrtc_tde
webrtc3 aec tde
Language: Python 11

daihuangyu's Repositories

daihuangyu/speex_aec_kf
speex aec kalman filter
Language: Python 5 1 03
daihuangyu/glide-text2im
GLIDE: a diffusion-based text-conditional image synthesis model
Language: Python 10
daihuangyu/NKF_train
NKF training
Language: Python 12
daihuangyu/webrtc_tde
webrtc3 aec tde
Language: Python 11
daihuangyu/DALL-E
PyTorch package for the discrete VAE used for DALL·E.
Language: Python 00
daihuangyu/DALLE2-pytorch
Implementation of DALL-E 2, OpenAI's updated text-to-image synthesis neural network, in Pytorch
Language: Python 00
daihuangyu/dsi
Short for "Do Something Interesting": doing some interesting things
daihuangyu/encodec
State-of-the-art deep learning based audio codec supporting both mono 24 kHz audio and stereo 48 kHz audio.
daihuangyu/faster-whisper
Faster Whisper transcription with CTranslate2
daihuangyu/FasterTransformer
Transformer related optimization, including BERT, GPT
daihuangyu/FunASR
A Fundamental End-to-End Speech Recognition Toolkit
daihuangyu/GPT2-chitchat
GPT2 for Chinese chitchat: a GPT2 model for Chinese small talk (implements the MMI** of DialoGPT)
daihuangyu/InferLLM
a lightweight LLM model inference framework
daihuangyu/JARVIS
JARVIS, a system to connect LLMs with ML community. Paper: https://arxiv.org/pdf/2303.17580.pdf
daihuangyu/kenlm
KenLM: Faster and Smaller Language Model Queries
daihuangyu/LAVIS
LAVIS - A One-stop Library for Language-Vision Intelligence BLip
daihuangyu/seamless_communication
Foundational Models for State-of-the-Art Speech and Text Translation
daihuangyu/so-vits-svc
SoftVC VITS Singing Voice Conversion
daihuangyu/stable-diffusion-webui
Stable Diffusion web UI
daihuangyu/street-fighter-ai
This is an AI agent for Street Fighter II Champion Edition.
daihuangyu/swift
Use PEFT or full-parameter training to fine-tune LLMs or MLLMs
daihuangyu/SwissArmyTransformer
SwissArmyTransformer is a flexible and powerful library to develop your own Transformer variants.
daihuangyu/TTS-1
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
daihuangyu/vall-e
PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech); can be trained on a single GPU!
daihuangyu/VITS-fast-fine-tuning
This repo is a pipeline of VITS finetuning for fast speaker adaptation TTS, and many-to-many voice conversion
daihuangyu/whisper
Robust Speech Recognition via Large-Scale Weak Supervision
daihuangyu/Whisper-1
High-performance GPGPU inference of OpenAI's Whisper automatic speech recognition (ASR) model
daihuangyu/whisperX
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
daihuangyu/ZJU-nCov-Hitcarder-1
Language: Python
daihuangyu/ZJU-nCov-Hitcarder-Sample
Sample of https://github.com/Long0x0/ZJU-nCov-Hitcarder.