NoneWait's Stars
langgenius/dify
Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.
coqui-ai/TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
2noise/ChatTTS
A generative speech model for daily dialogue.
stanford-oval/storm
An LLM-powered knowledge curation system that researches a topic and generates a full-length report with citations.
LargeWorldModel/LWM
jina-ai/reader
Convert any URL to an LLM-friendly input with a simple prefix https://r.jina.ai/
modelscope/FunASR
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
langchain-ai/langgraph
Build resilient language agents as graphs.
xiangyuecn/Recorder
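HTML5 JS audio recording in mp3, wav, ogg, webm, amr, g711a, and g711u formats; supports PC and some Android and iOS browsers, Hybrid Apps (Android and iOS app source code provided), and WeChat; provides ASR speech-to-text, an H5 voice call and chat example, and DTMF encoding and decoding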
pytorch/torchtune
A Native-PyTorch Library for LLM Fine-tuning
metavoiceio/metavoice-src
Foundational model for human-like, expressive TTS
microsoft/Megatron-DeepSpeed
Ongoing research training transformer language models at scale, including: BERT & GPT-2
0nutation/SpeechGPT
SpeechGPT Series: Speech Large Language Models
SkyworkAI/Skywork
Skywork series models are pre-trained on 3.2TB of high-quality multilingual (mainly Chinese and English) and code data. We have open-sourced the model parameters, training data, evaluation data, evaluation methods, etc.
sh-lee-prml/HierSpeechpp
The official implementation of HierSpeech++
gemelo-ai/vocos
Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis
HITsz-TMG/UMOE-Scaling-Unified-Multimodal-LLMs
Code for "Uni-MoE: Scaling Unified Multimodal Models with Mixture of Experts"
OpenMOSS/AnyGPT
Code for "AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling"
zhuzilin/ring-flash-attention
Ring attention implementation with flash attention
NVIDIA/NeMo-Aligner
Scalable toolkit for efficient model alignment
RLHFlow/Online-RLHF
A recipe for online RLHF and online iterative DPO.
feifeibear/long-context-attention
USP: Unified (a.k.a. Hybrid, 2D) Sequence Parallel Attention for Long Context Transformers Model Training and Inference
fastai/nbdev_template
Template for nbdev projects
frutik/Awesome-RAG
nttcslab-sp/kaldiio
A pure python module for reading and writing kaldi ark files
IAAR-Shanghai/CRUD_RAG
CRUD-RAG: A Comprehensive Chinese Benchmark for Retrieval-Augmented Generation of Large Language Models
OpenT2S/LlamaVoice
LlamaVoice is a llama-based large voice generation model, providing inference and training ability.
THUDM/LongAlign
[EMNLP 2024] LongAlign: A Recipe for Long Context Alignment of LLMs
OFA-Sys/Ditto
A self-alignment method for role-play. Benchmark for role-play. Resources for "Large Language Models are Superpositions of All Characters: Attaining Arbitrary Role-play via Self-Alignment".
bojone/FSQ
Keras implementation of Finite Scalar Quantization