rookie0607

Beijing

Pinned Repositories

espnet
End-to-End Speech Processing Toolkit
Language:Python8.5k 181 2.4k2.2k
SenseVoice
Multilingual Voice Understanding Model
Language:Python3.5k 38 132317
FunASR
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
Language:Python7k 65 1.2k752
ms-swift
Use PEFT or Full-parameter to finetune 400+ LLMs or 100+ MLLMs. (LLM: Qwen2.5, Llama3.2, GLM4, Internlm2.5, Yi1.5, Mistral, Baichuan2, DeepSeek, Gemma2, ...; MLLM: Qwen2-VL, Qwen2-Audio, Llama3.2-Vision, Llava, InternVL2, MiniCPM-V-2.6, GLM4v, Xcomposer2.5, Yi-VL, DeepSeek-VL, Phi3.5-Vision, ...)
Language:Python4.3k 23 1.3k381
babel_kws
Spoken keyword search in complex environments, based on the Kaldi babel project
0 1 00
FunASR
A Fundamental End-to-End Speech Recognition Toolkit
Language:Python0 0 00
Keyword-Transformer
Implementation of the paper "Keyword Transformer: A Self-Attention Model for Keyword Spotting"
Language:Shell0 0 00
kwa
0 1 00
mmrotate
OpenMMLab Rotated Object Detection Toolbox and Benchmark
Language:Python0 0 00
whisper_streaming
Whisper realtime streaming for long speech-to-text transcription and translation
Language:Python2.1k 36 107255