rsandx's Stars
openai/whisper
Robust Speech Recognition via Large-Scale Weak Supervision
XingangPan/DragGAN
Official Code for DragGAN (SIGGRAPH 2023)
facebookresearch/fairseq
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
facebookresearch/audiocraft
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
facebookresearch/seamless_communication
Foundational Models for State-of-the-Art Speech and Text Translation
guoyww/AnimateDiff
Official implementation of AnimateDiff.
PaddlePaddle/PaddleGAN
PaddlePaddle GAN library, including lots of interesting applications like First-Order motion transfer, Wav2Lip, picture repair, image editing, photo2cartoon, image style transfer, GPEN, and so on.
openai/jukebox
Code for the paper "Jukebox: A Generative Model for Music"
AILab-CVC/VideoCrafter
VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models
bryandlee/animegan2-pytorch
PyTorch implementation of AnimeGANv2
minivision-ai/photo2cartoon
Portrait cartoonization exploration project (photo-to-cartoon translation project)
huggingface/distil-whisper
Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate.
williamyang1991/VToonify
[SIGGRAPH Asia 2022] VToonify: Controllable High-Resolution Portrait Video Style Transfer
igorprado/react-notification-system
A complete and totally customizable component for notifications in React
context-labs/autodoc
Experimental toolkit for auto-generating codebase documentation using LLMs
ming024/FastSpeech2
An implementation of Microsoft's "FastSpeech 2: Fast and High-Quality End-to-End Text to Speech"
google-ai-edge/mediapipe-samples
volotat/SD-CN-Animation
This script automates the video stylization task using StableDiffusion and ControlNet.
SpeechifyInc/Meta-voicebox
Implementation of Meta-Voicebox: the first generative AI model for speech to generalize across tasks with state-of-the-art performance.
Emotional-Text-to-Speech/dl-for-emo-tts
:computer: :robot: A summary on our attempts at using Deep Learning approaches for Emotional Text to Speech :speaker:
soumik-kanad/diff2lip
zhw2590582/WFPlayer
:ocean: WFPlayer.js is an audio waveform generator
keonlee9420/Expressive-FastSpeech2
PyTorch Implementation of Non-autoregressive Expressive (emotional, conversational) TTS based on FastSpeech2, supporting English, Korean, and your own languages.
keonlee9420/Cross-Speaker-Emotion-Transfer
PyTorch Implementation of ByteDance's Cross-speaker Emotion Transfer Based on Speaker Condition Layer Normalization and Semi-Supervised Training in Text-To-Speech
jmoso13/jukebox-diffusion
binodswain/react-faq-component
React package to render FAQ section
ganeshmani/react-table-pagination-example
This repo is a demo of React table pagination handling 1 million records from the server
ceramicwhite/IllusionDiffusion
Fork of huggingface.co/spaces/AP123/IllusionDiffusion
leiyi420/MsEmoTTS
fhixa/CodeScribe
CodeScribe - An automated way to describe code