gilesc's Stars
myshell-ai/OpenVoice
Instant voice cloning by MIT and MyShell.
VikParuchuri/surya
OCR, layout analysis, reading order, table recognition in 90+ languages
PaddlePaddle/PaddleSpeech
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.
neuml/txtai
💡 All-in-one open-source embeddings database for semantic search, LLM orchestration and language model workflows
SillyTavern/SillyTavern
LLM Frontend for Power Users.
mistralai/mistral-src
Reference implementation of Mistral AI 7B v0.1 model.
OpenAccess-AI-Collective/axolotl
Go ahead and axolotl questions
Zjh-819/LLMDataHub
A quick guide (especially) for trending instruction finetuning datasets
davabase/whisper_real_time
Real time transcription with OpenAI Whisper.
collabora/WhisperLive
A nearly-live implementation of OpenAI's Whisper.
Kav-K/GPTDiscord
A robust, all-in-one GPT interface for Discord. ChatGPT-style conversations, image generation, AI moderation, custom indexes/knowledge base, YouTube summarizer, and more!
FL33TW00D/whisper-turbo
Cross-Platform, GPU Accelerated Whisper 🏎️
Vincentqyw/cv-arxiv-daily
🎓 Automatically Update CV Papers Daily using GitHub Actions (Update Every 2 Days)
BruceMacD/chatd
Chat with your documents using local AI
quqxui/Awesome-LLM4IE-Papers
Awesome papers about generative Information Extraction (IE) using Large Language Models (LLMs)
mattbierbaum/arxiv-public-datasets
A set of scripts to grab public datasets from resources related to arXiv
Sharrnah/whispering
Whispering Tiger - OpenAI's Whisper (and other models) with OSC and WebSocket support, allowing live transcription/translation in VRChat and overlays in most streaming applications
nexus-stc/stc
Distributed free search engine and AI tools that grant access to knowledge
google/speaker-id
This repository contains audio samples and supplementary materials accompanying publications by the "Speaker, Voice and Language" team at Google.
neuml/paperetl
📄 ⚙️ ETL processes for medical and scientific papers
AutoLLM/ArxivDigest
ArXiv Digest and Personalized Recommendations using Large Language Models
for-ai/parameter-efficient-moe
shog-ai/shoggoth
Shoggoth is a peer-to-peer network for publishing and distributing open-source Artificial Intelligence
armancohan/arxiv-tools
Tools to bulk download arXiv data
Illyism/openai-whisper-api
OpenAI Whisper API based on Node.js / Bun.sh in a Docker Container + Google Cloud Run Example
nalbion/whisper-server
Streaming speech-to-text server using Whisper
homanp/nagato
🌸 The open framework for question answering fine-tuning LLMs on private data
gkamradt/FineTuningClone
EPFLiGHT/MultiModN
MultiModN – Multimodal, Multi-Task, Interpretable Modular Networks (NeurIPS 2023)
teknium1/teknium1
Config files for my GitHub profile.