BinWang28's Stars
hiyouga/LLaMA-Factory
Unified Efficient Fine-Tuning of 100+ LLMs (ACL 2024)
meta-llama/llama3
The official Meta Llama 3 GitHub site
QwenLM/Qwen2
Qwen2 is the large language model series developed by the Qwen team, Alibaba Cloud.
wenet-e2e/wenet
Production First and Production Ready End-to-End Speech Recognition Toolkit
FunAudioLLM/SenseVoice
Multilingual Voice Understanding Model
facebookresearch/encodec
State-of-the-art deep-learning-based audio codec supporting both mono 24 kHz audio and stereo 48 kHz audio.
huggingface/datatrove
Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks.
facebookresearch/chameleon
Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR.
jianfch/stable-ts
Transcription, forced alignment, and audio indexing with OpenAI's Whisper
QwenLM/Qwen2-Audio
The official repo of the Qwen2-Audio chat and pretrained large audio language model proposed by Alibaba Cloud.
jishengpeng/WavTokenizer
SOTA discrete acoustic codec models with 40 tokens per second for audio language modeling
prometheus-eval/prometheus-eval
Evaluate your LLM's response with Prometheus and GPT4 💯
ga642381/speech-trident
Awesome speech/audio LLMs, representation learning, and codec models
NVIDIA/enroot
A simple yet powerful tool to turn traditional container/OS images into unprivileged sandboxes.
Yuan-ManX/ai-audio-datasets
AI Audio Datasets (AI-ADS) 🎵, including Speech, Music, and Sound Effects, which can provide training data for Generative AI, AIGC, AI model training, intelligent audio tool development, and audio applications.
NVIDIA/pyxis
Container plugin for Slurm Workload Manager
IndoNLP/nusa-crowd
A collaborative project to collect datasets in Indonesian languages.
AI4Bharat/IndicTrans2
Translation models for 22 scheduled languages of India
NVIDIA/audio-flamingo
PyTorch implementation of Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dialogue Abilities.
aiverify-foundation/moonshot
Moonshot - A simple and modular tool to evaluate and red-team any LLM application.
AudioLLMs/AudioLLM
Audio Large Language Models
homebrewltd/llama3-s
Llama3.1 learns to Listen
AudioLLMs/AudioBench
AudioBench: A Universal Benchmark for Audio Large Language Models
wsntxxn/AudioCaption
Audio captioning recipe
Labbeti/aac-metrics
Metrics for evaluating Automated Audio Captioning systems, designed for PyTorch.
mulab-mir/muchomusic
MuChoMusic is a benchmark for evaluating music understanding in multimodal audio-language models.
SeaEval/SeaEval
NAACL 2024: SeaEval for Multilingual Foundation Models: From Cross-Lingual Alignment to Cultural Reasoning
openaudiolab/LLaST
LLaST: Improved End-to-end Speech Translation System Leveraged by Large Language Models
SeaEval/CRAFT
ACL 2024 Workshop: CRAFT: Extracting and Tuning Cultural Instructions from the Wild
zouxunlong/web_crawl