Fuann

National Taiwan Normal University, Taipei, Taiwan

Fuann's Stars

state-spaces/mamba
Mamba SSM architecture
Language: Python 13.6k 99 5781.2k
m-bain/whisperX
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
Language: Python 13.1k 137 7341.4k
Lightning-AI/litgpt
20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.
Language: Python 11k 98 8171.1k
modelscope/ms-swift
Use PEFT or Full-parameter to finetune 400+ LLMs (Qwen2.5, Llama3.2, GLM4, Internlm2.5, Yi1.5, Mistral, Baichuan2, DeepSeek, ...) or 100+ MLLMs (Qwen2-VL, Qwen2-Audio, Llama3.2-Vision, Llava, InternVL2.5, MiniCPM-V-2.6, GLM4v, Xcomposer2.5, Yi-VL, DeepSeek-VL2, Phi3.5-Vision, GOT-OCR2, ...).
Language: Python 4.8k 23 1.5k422
s3prl/s3prl
Self-Supervised Speech Pre-training and Representation Learning Toolkit
Language: Python 2.3k 46 399486
QwenLM/Qwen-Audio
The official repo of Qwen-Audio (通义千问-Audio) chat & pretrained large audio language model proposed by Alibaba Cloud.
Language: Python 1.5k 26 67111
QwenLM/Qwen2-Audio
The official repo of Qwen2-Audio chat & pretrained large audio language model proposed by Alibaba Cloud.
Language: Python 1.3k 33 8792
stanfordnlp/pyreft
ReFT: Representation Finetuning for Language Models
Language: Python 1.3k 18 94111
yousinix/portfolYOU
A beautiful portfolio Jekyll theme that works with GitHub Pages.
Language: HTML 1k 17 70601
k2-fsa/icefall
Language: Python 970 48 685308
DmitryRyumin/INTERSPEECH-2023-24-Papers
INTERSPEECH 2023-2024 Papers: A complete collection of influential and exciting research papers from the INTERSPEECH 2023-24 conference. Explore the latest advances in speech and language processing. Code included. Star the repository to support the advancement of speech technology!
650 89 442
santi-pdp/pase
Problem Agnostic Speech Encoder
Language: Python 440 22 4687
jonatasgrosman/huggingsound
HuggingSound: A toolkit for speech-related tasks based on Hugging Face's tools
Language: Python 438 14 4844
carlthome/python-audio-effects
Apply audio effects such as reverb and EQ directly to audio files or NumPy ndarrays.
Language: Python 386 12 2352
oliverguhr/wav2vec2-live
Live speech recognition using Facebook's wav2vec 2.0 model.
Language: Python 333 7 1556
PolyAI-LDN/pheme
Language: Python 254 11 2025
marekrei/sequence-labeler
Neural network sequence labeling model
Language: Python 252 15 1274
YuanGongND/gopt
Code for the ICASSP 2022 paper "Transformer-Based Multi-Aspect Multi-Granularity Non-native English Speaker Pronunciation Assessment".
Language: Python 156 5 3528
JusperLee/SPMamba
Language: Python 142 4 2017
ga642381/SpeechPrompt
**Interspeech 2022** "SpeechPrompt: An Exploration of Prompt Tuning on Generative Spoken Language Model for Speech Processing Tasks". Speech processing with the prompting paradigm.
Language: Python 97 6 38
aleXiehta/PhoneFortifiedPerceptualLoss
Improving Perceptual Quality by Phone-Fortified Perceptual Loss using Wasserstein Distance for Speech Enhancement
Language: Python 76 4 517
lstrgar/self-supervised-phone-segmentation
Phoneme segmentation using pre-trained speech models
Language: Python 54 5 510
archiki/Robust-E2E-ASR
This repository contains the code for our upcoming paper "An Investigation of End-to-End Models for Robust Speech Recognition" at ICASSP 2021.
Language: Python 46 3 610
articulatory/articulatory
Deep Articulatory Synthesis and Inversion
Language: Python 42 5 23
JazminVidal/gop-dnn-epadb
Goodness of Pronunciation using Kaldi on Epa-DB database
Language: Python 33 4 75
Observeai-Research/Phoneme-BERT
33 2 21
JuanPZuluaga/accent-recog-slt2022
Repository for Accent Recognition (Hackathon @SLT2022)
Language: Jupyter Notebook 23 5 17
juice500ml/dysarthria-gop
Language: Python 20 2 23
doheejin/SB_loss_PA
This repository is the implementation of the paper, "Score-balanced Loss for Multi-aspect Pronunciation Assessment" (Interspeech 2023).
Language: Python 17 1 11
hcraighead/automated-english-transcription-grader
Investigating the effect of auxiliary objectives for the automated grading of learner English speech transcriptions (ACL 2020)
Language: Python 7 3 03
