wxbool's Stars
huggingface/transformers
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
ollama/ollama
Get up and running with Llama 3, Mistral, Gemma, and other large language models.
huggingface/candle
Minimalist ML framework for Rust
jasonppy/VoiceCraft
Zero-Shot Speech Editing and Text-to-Speech in the Wild
Zejun-Yang/AniPortrait
AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animation
lich0821/WeChatFerry
A low-level framework for WeChat bots that can connect to large models such as Gemini, ChatGPT, ChatGLM, iFlytek Spark, and Tigerbot. WeChat Robot Hook.
huggingface/parler-tts
Inference and training library for high-quality TTS models.
xinyu1205/recognize-anything
Open-source and strong foundation image recognition models.
facebookresearch/denoiser
Real Time Speech Enhancement in the Waveform Domain (Interspeech 2020). We provide a PyTorch implementation of the paper Real Time Speech Enhancement in the Waveform Domain, in which we present a causal speech enhancement model that works on the raw waveform and runs in real time on a laptop CPU. The proposed model is based on an encoder-decoder architecture with skip connections. It is optimized in both the time and frequency domains, using multiple loss functions. Empirical evidence shows that it can remove various kinds of background noise, including stationary and non-stationary noise, as well as room reverb. Additionally, we suggest a set of data augmentation techniques, applied directly to the raw waveform, that further improve the model's performance and generalization.
TMElyralab/MuseTalk
MuseTalk: Real-Time High Quality Lip Synchronization with Latent Space Inpainting
pemistahl/lingua-go
The most accurate natural language detection library for Go, suitable for short text and mixed-language text
nihui/realsr-ncnn-vulkan
RealSR super resolution implemented with ncnn library
declare-lab/tango
A family of diffusion models for text-to-audio generation.
showlab/Image2Paragraph
[A toolbox for fun.] Transform Image into Unique Paragraph with ChatGPT, BLIP2, OFA, GRIT, Segment Anything, ControlNet.
QPT-Family/QPT
[In closed beta] QPT: a Python-to-EXE tool (Python packaging) dedicated to helping open-source projects better reach the wider internet world.
axodox/axodox-machinelearning
This repository contains a pure C++ ONNX implementation of multiple offline AI models, such as StableDiffusion (1.5 and XL), ControlNet, Midas, HED and OpenPose.
muesli/kmeans
k-means clustering algorithm implementation written in Go
gotranspile/cxgo
Tool for transpiling C to Go.
axodox/unpaint
A simple Windows / Xbox app for generating AI images with Stable Diffusion.
KdaiP/StableTTS
Next-generation TTS model using flow-matching and DiT, inspired by Stable Diffusion 3
numediart/EmoV-DB
The Emotional Voices Database: Towards Controlling the Emotional Expressiveness in Voice Generation Systems
uthree/tinyvc
Lightweight voice conversion.
jack139/go-infer
Go framework for DL model inference and API deployment
chenyangMl/keyword-spot
An end-to-end voice wake-up toolbox, from model training to model inference.
NextAudioGen/ultimatevocalremover_api
API for a Vocal Remover that uses Deep Neural Networks.
instant-high/wav2lip-onnx-HQ
Full version of wav2lip-onnx including face alignment and face enhancement and more...
spkgyk/RTFS-Net
Official code release for "RTFS-Net: Recurrent time-frequency modelling for efficient audio-visual speech separation", accepted at ICLR 2024
Okrio/tinyrecurrentunet
Real-Time De-noising and De-reverbing with Tiny Recurrent UNet
yvonwin/qwen2.cpp
C++ implementation of Qwen2 and Llama 3
fxkt-tech/liv
A friendly FFmpeg wrapper for Go.