Shuo-H

CMU · Pitts, PA

Shuo-H's Stars

microsoft/AudioEntailment
Audio Entailment: Deductive Reasoning for Audio Understanding
101
naklecha/llama3-from-scratch
llama3 implementation one matrix multiplication at a time
Language: Jupyter Notebook · 13.2k stars · 1.1k forks
QwenLM/Qwen-Audio
The official repo of Qwen-Audio (通义千问-Audio) chat & pretrained large audio language model proposed by Alibaba Cloud.
Language: Python · 1.4k stars · 105 forks
openai/whisper
Robust Speech Recognition via Large-Scale Weak Supervision
Language: Python · 68.9k stars · 8.1k forks
mli/paper-reading
Paragraph-by-paragraph close reading of classic and new deep learning papers
26.5k stars · 2.4k forks
KonanAI/konanai
Language: Python · 1
huggingface/transformers
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
Language: Python · 133k stars · 26.5k forks
Audio-AGI/AudioSep
Official implementation of "Separate Anything You Describe"
Language: Python · 1.6k stars · 115 forks
etzinis/unsup_speech_enh_adaptation
Unsupervised domain adaptation for conversational speech enhancement using RemixIT
Language: Jupyter Notebook · 515
konan-ai/konanai-archive
Language: Python · 21
espeak-ng/espeak-ng
eSpeak NG is an open source speech synthesizer that supports more than a hundred languages and accents.
Language: C · 4.1k stars · 887 forks
m-bain/whisperX
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
Language: Python · 11.8k stars · 1.2k forks
facebookresearch/ImageBind
ImageBind: One Embedding Space to Bind Them All
Language: Python · 8.3k stars · 758 forks
MLSpeech/FormantsTracker
Language: Python · 105
google-research/tuning_playbook
A playbook for systematically maximizing the performance of deep learning models.
26.6k stars · 2.2k forks
YunyangZeng/TAPLoss
Language: Python · 6312
