jkoprax's Stars
suno-ai/bark
🔊 Text-Prompted Generative Audio Model
mozilla/DeepSpeech
DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.
microsoft/semantic-kernel
Integrate cutting-edge LLM technology quickly and easily into your apps
kaldi-asr/kaldi
kaldi-asr/kaldi is the official location of the Kaldi project.
espnet/espnet
End-to-End Speech Processing Toolkit
pyannote/pyannote-audio
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
mravanelli/pytorch-kaldi
pytorch-kaldi is a project for developing state-of-the-art DNN/RNN hybrid speech recognition systems. The DNN part is managed by PyTorch, while feature extraction, label computation, and decoding are performed with the Kaldi toolkit.
gcui-art/suno-api
Use an API to call suno.ai's music generation AI and easily integrate it into agents such as GPTs.
bbernhard/signal-cli-rest-api
Dockerized Signal Messenger REST API
modelscope/3D-Speaker
A Repository for Single- and Multi-modal Speaker Verification, Speaker Recognition and Speaker Diarization
juanmc2005/diart
A python package to build AI-powered real-time audio applications
transcriptionstream/transcriptionstream
Turnkey self-hosted offline transcription and diarization service with LLM summary
audeering/opensmile
The Munich Open-Source Large-Scale Multimedia Feature Extractor
wq2012/SpectralCluster
Python re-implementation of the (constrained) spectral clustering algorithms used in Google's speaker diarization papers.
hitachi-speech/EEND
End-to-End Neural Diarization
raffael/SISinusWaveView
A Siri-like voice input visualizer using EZAudio.
algolia/voice-overlay-android
🗣 An overlay that gets your user’s voice permission and input as text in a customizable UI
flowese/UdioWrapper
UdioWrapper is a Python package that enables the generation of music tracks using Udio's API through textual prompts. This package is based on the reverse engineering of the Udio API (https://www.udio.com/) and is not officially endorsed by Udio.
klinker24/wearable-reply
Simplify text input for Android Wear 2.0, by voice, keyboard, or canned response.
bedangSen/VoiceSens
A Voice Biometric Application using Watson Speech to Text
open-webui/extension
WIP: Open WebUI Chrome Extension (Requires Open WebUI v0.2.0+)
Raymo111/voiceprint
Voice biometric authentication PAM module for Linux
NaveedShahid/Voice-Authentication-CNN
Voice authentication system implementation using Python
open-webui/assistant
No longer actively maintained. Please use https://github.com/open-webui/extension instead.
itmo-mbss-lab/sr_labs_book
The project is related to the development of labs for the ITMO Speaker Recognition Course.
IDRnD/idvoice-gpt-ios-demo
IDVoice + ChatGPT iOS demo app
conceptslabs/thought_factory
schaltung/FishBoardMix
The FishBoardMix corpus is designed to explore Speaker-Age estimation technology.
IDRnD/idvoice-gpt-android-demo
IDVoice + ChatGPT Android demo app
jakubbortlik/accent_rating
A collection of scripts and data I used when working on my dissertation