ham-p's Stars
modelscope/KAN-TTS
KAN-TTS is a speech-synthesis training framework. Please try the demos we have posted at https://modelscope.cn/models?page=1&tasks=text-to-speech
suno-ai/bark
🔊 Text-Prompted Generative Audio Model
OpenMOSS/MOSS
An open-source tool-augmented conversational language model from Fudan University
0hq/WebGPT
Run GPT models in the browser with WebGPU. An implementation of GPT inference in fewer than ~1500 lines of vanilla JavaScript.
Sharrnah/whispering
Whispering Tiger - OpenAI's whisper (and other models) with OSC and WebSocket support, allowing live transcription / translation in VRChat and overlays in most streaming applications
ggerganov/llama.cpp
LLM inference in C/C++
innnky/so-vits-svc
A singing voice timbre conversion model based on VITS and SoftVC
skywalker023/sodaverse
🥤🧑🏻‍🚀 Code and dataset for our EMNLP 2023 paper - "SODA: Million-scale Dialogue Distillation with Social Commonsense Contextualization"
thesephist/monocle
Universal personal search engine, powered by a full text search algorithm written in pure Ink, indexing Linus's blogs and private note archives, contacts, tweets, and over a decade of journals.
promptslab/Promptify
Prompt Engineering | Prompt Versioning | Use GPT or other prompt based models to get structured output. Join our discord for Prompt-Engineering, LLMs and other latest research
mpetazzoni/sse.js
A flexible Server-Sent Events EventSource polyfill for Javascript
j2kun/imsdb_download_all_scripts
Download all plaintext scripts from imsdb.com
Hiswe/vh-check
mobile vh unit utility
Tomiinek/Multilingual_Text_to_Speech
An implementation of Tacotron 2 that supports multilingual experiments with parameter-sharing, code-switching, and voice cloning.
NVIDIA/tacotron2
Tacotron 2 - PyTorch implementation with faster-than-realtime inference
quoth/fastapi-cloud-logging
bsolomon1124/pycld3
Python3 bindings for the Compact Language Detector v3 (CLD3)
saffsd/langid.py
Stand-alone language identification system
wooorm/franc
Natural language detection
Rezmason/matrix
matrix (web-based green code rain, made with love)
chrisguttandin/extendable-media-recorder
An extendable drop-in replacement for the native MediaRecorder.
mozilla/DeepSpeech
DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.
openai/point-e
Point cloud diffusion for 3D model synthesis
xtermjs/xterm.js
A terminal for the web
CorentinJ/Real-Time-Voice-Cloning
Clone a voice in 5 seconds to generate arbitrary speech in real-time
neonbjb/tortoise-tts
A multi-voice TTS system trained with an emphasis on quality
coqui-ai/TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
ashawkey/stable-dreamfusion
Text-to-3D & Image-to-3D & Mesh Exportation with NeRF + Diffusion.
shirayu/whispering
Streaming transcriber with whisper
eladrich/latent-nerf
Official Implementation for "Latent-NeRF for Shape-Guided Generation of 3D Shapes and Textures"