souvikqb

Pinned Repositories

metavoice-src
Foundational model for human-like, expressive TTS
Language: Python 4k 80 128668
PixArt-sigma
PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation
Language: Python 1.7k 40 12884
pheme
Language: Python 254 11 2025
audiocraft
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
Language: Python 0 0 0 0
bark-voice-cloning-HuBERT-quantizer
The code for the bark-voicecloning model. Training and inference.
Language: Python 0 0 0 0
bark-with-voice-clone
🔊 Text-prompted Generative Audio Model - With the ability to clone voices
Language: Jupyter Notebook 0 0 0 0
PhotoMaker
PhotoMaker
Language: Jupyter Notebook 0 0 0 0
SadTalker
[CVPR 2023] SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation
Language: Python 0 0 0 0

souvikqb's Repositories

souvikqb/audiocraft
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
Language: Python 0 0 0 0
souvikqb/bark-voice-cloning-HuBERT-quantizer
The code for the bark-voicecloning model. Training and inference.
Language: Python 0 0 0 0
souvikqb/bark-with-voice-clone
🔊 Text-prompted Generative Audio Model - With the ability to clone voices
Language: Jupyter Notebook 0 0 0 0
souvikqb/PhotoMaker
PhotoMaker
Language: Jupyter Notebook 0 0 0 0
souvikqb/SadTalker
[CVPR 2023] SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation
Language: Python 0 0 0 0