zhong-ying-china's Stars
m-bain/whisperX
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
Vision-CAIR/MiniGPT4-video
Official code for Goldfish model for long video understanding and MiniGPT4-video for short video understanding
unslothai/unsloth
Finetune Llama 3.1, Mistral, Phi & Gemma LLMs 2-5x faster with 80% less memory
bentoml/OpenLLM
Run any open-source LLM, such as Llama 3.1 or Gemma, as an OpenAI-compatible API endpoint in the cloud.
jianfch/stable-ts
Transcription, forced alignment, and audio indexing with OpenAI's Whisper
meta-llama/llama-recipes
Scripts for fine-tuning Meta Llama3 with composable FSDP & PEFT methods to cover single/multi-node GPUs. Supports default & custom datasets for applications such as summarization and Q&A. Supports a number of candidate inference solutions, such as HF TGI and vLLM, for local or cloud deployment. Includes demo apps to showcase Meta Llama3 for WhatsApp & Messenger.
meta-llama/llama
Inference code for Llama models
wq2012/awesome-diarization
A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources.
SpeechFlow-io/Spoken_language_identification
TensorFlow-based spoken language identification
savoirfairelinux/num2words
Modules to convert numbers to words. 42 --> forty-two
wenet-e2e/wenet
Production First and Production Ready End-to-End Speech Recognition Toolkit