eruma's Stars
microsoft/graphrag
A modular graph-based Retrieval-Augmented Generation (RAG) system
stanford-oval/storm
An LLM-powered knowledge curation system that researches a topic and generates a full-length report with citations.
fishaudio/fish-speech
SOTA Open Source TTS
Shubhamsaboo/awesome-llm-apps
Collection of awesome LLM apps with AI Agents and RAG using OpenAI, Anthropic, Gemini and opensource models.
actix/actix
Actor framework for Rust.
Cysharp/UniTask
Provides an efficient allocation free async/await integration for Unity.
miurla/morphic
An AI-powered search engine with a generative UI
pipecat-ai/pipecat
Open Source framework for voice and multimodal conversational AI
vocodedev/vocode-core
🤖 Build voice-based LLM agents. Modular + open source.
Camb-ai/MARS5-TTS
MARS5 speech model (TTS) from CAMB.AI
BasedHardware/Friend
AI wearable necklace
IAHispano/Applio
A simple, high-quality voice conversion tool focused on ease of use and performance.
apple/ml-4m
4M: Massively Multimodal Masked Modeling
OS-Copilot/OS-Copilot
A self-improving embodied conversational agent seamlessly integrated into the operating system to automate our daily tasks.
DigitalPhonetics/IMS-Toucan
Controllable and fast Text-to-Speech for over 7000 languages!
QwenLM/Qwen2-Audio
The official repo of Qwen2-Audio chat & pretrained large audio language model proposed by Alibaba Cloud.
MontrealCorpusTools/Montreal-Forced-Aligner
Command line utility for forced alignment using Kaldi
TensorSpeech/TensorFlowASR
⚡ TensorFlowASR: Almost state-of-the-art Automatic Speech Recognition in TensorFlow 2. Supports languages that can use characters or subwords.
Neph0s/awesome-llm-role-playing-with-persona
Awesome-llm-role-playing-with-persona: a curated list of resources for large language models for role-playing with assigned personas
EmulationAI/awesome-large-audio-models
Collection of resources on the applications of Large Language Models (LLMs) in Audio AI.
rtvi-ai/rtvi-web-demo
Example UI implementing the RTVI web client
lucidrains/e2-tts-pytorch
Implementation of E2-TTS, "Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS", in PyTorch
ensan-hcl/azooKey
azooKey: A Japanese Keyboard iOS Application Fully Developed in Swift
ikegami-yukino/neologdn
Japanese text normalizer for mecab-neologd
isi-nlp/uroman
Universal Romanizer that can convert any unicode script to roman (latin) script
SALT-NLP/demonstrated-feedback
Wataru-Nakata/miipher
Unofficial implementation of miipher
feldberlin/timething
Timething is a library for aligning text transcripts with their audio recordings.
facebookresearch/MemoryMosaics
Memory Mosaics are networks of associative memories working in concert to achieve a prediction task.
zeyuxie29/AudioTime