Pinned Repositories
3dmsr
3D model shape retrieval
AI-metrics
An open source project to document AI progress through data.
alexa-sign-language-translator
A project to make Amazon Echo respond to sign language using your webcam
ambient-gan
Code to reproduce results from the paper "AmbientGAN: Generative models from lossy measurements"
Amphion
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
ancient-text-restoration
Restoring ancient text using deep learning: a case study on Greek epigraphy.
apollo
An open autonomous driving platform
ASR-decoder
An ASR decoder and graph-building project
asr_preprocessing
Python implementation of pre-processing for End-to-End speech recognition
Qoboty's Repositories
Qoboty/Amphion
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
Qoboty/bark
🔊 Text-Prompted Generative Audio Model
Qoboty/Bert-VITS2-ext
Expression and animation tests based on Bert-VITS2
Qoboty/best-rq-pytorch
Implementation of BEST-RQ - a model for self-supervised learning of speech signals using a random projection quantizer, in Pytorch.
Qoboty/clash
A rule-based tunnel in Go.
Qoboty/HierSpeechpp
The official implementation of HierSpeech++
Qoboty/llark
Code for the paper "LLark: A Multimodal Foundation Model for Music" by Josh Gardner, Simon Durand, Daniel Stoller, and Rachel Bittner.
Qoboty/ltu
Code, Dataset, and Pretrained Models for Audio and Speech Large Language Model "Listen, Think, and Understand".
Qoboty/magic-animate
MagicAnimate: Temporally Consistent Human Image Animation using Diffusion Model
Qoboty/math-lm
Qoboty/MetaMath
MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models
Qoboty/musicfm
Qoboty/MyHeyGen
Qoboty/OpenVoice
Instant voice cloning by MyShell
Qoboty/parler-tts
Inference and training library for high-quality TTS models.
Qoboty/SiFi-VITS2-44100-Ja
DDPM-based Pitch Generation and Pitch Controllable Voice Synthesis.
Qoboty/SoundStorm
The reproduced code for Google's SoundStorm
Qoboty/SpeechTokenizer
This is the code for the SpeechTokenizer presented in "SpeechTokenizer: Unified Speech Tokenizer for Speech Language Models". Samples are presented on
Qoboty/stable-audio-tools
Generative models for conditional audio generation
Qoboty/StyleTTS2
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
Qoboty/TTS-xtts
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
Qoboty/UniAudio
The Open Source Code of UniAudio
Qoboty/UniCATS-CTX-txt2vec
CTX-txt2vec, the acoustic model in UniCATS
Qoboty/UniCATS-CTX-vec2wav
Code for CTX-vec2wav in UniCATS
Qoboty/vall-e
PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech). Reproduced demo: https://lifeiteng.github.io/valle/index.html
Qoboty/VALL-E-X
An open source implementation of Microsoft's VALL-E X zero-shot TTS model. Demo is available at https://plachtaa.github.io
Qoboty/Video-LLaVA
Video-LLaVA: Learning United Visual Representation by Alignment Before Projection
Qoboty/vocode-python
🤖 Build voice-based LLM agents. Modular + open source.
Qoboty/VoiceCraft
Zero-Shot Speech Editing and Text-to-Speech in the Wild
Qoboty/VoiceFlow-TTS