Pinned Repositories
metavoice-src
Foundational model for human-like, expressive TTS
PixArt-sigma
PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation
pheme
audiocraft
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
bark-voice-cloning-HuBERT-quantizer
The code for the bark-voicecloning model. Training and inference.
bark-with-voice-clone
🔊 Text-prompted Generative Audio Model - With the ability to clone voices
PhotoMaker
PhotoMaker
SadTalker
[CVPR 2023] SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation
souvikqb's Repositories
souvikqb/audiocraft
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
souvikqb/bark-voice-cloning-HuBERT-quantizer
The code for the bark-voicecloning model. Training and inference.
souvikqb/bark-with-voice-clone
🔊 Text-prompted Generative Audio Model - With the ability to clone voices
souvikqb/PhotoMaker
PhotoMaker
souvikqb/SadTalker
[CVPR 2023] SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation