rave974's Stars
Significant-Gravitas/AutoGPT
AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.
openai/whisper
Robust Speech Recognition via Large-Scale Weak Supervision
suno-ai/bark
🔊 Text-Prompted Generative Audio Model
AliaksandrSiarohin/first-order-model
This repository contains the source code for the paper First Order Motion Model for Image Animation
neonbjb/tortoise-tts
A multi-voice TTS system trained with an emphasis on quality
BlinkDL/RWKV-LM
RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). It combines the best of RNNs and transformers: great performance, fast inference, low VRAM use, fast training, "infinite" ctx_len, and free sentence embedding.
CompVis/latent-diffusion
High-Resolution Image Synthesis with Latent Diffusion Models
NVIDIA/NeMo
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
speechbrain/speechbrain
A PyTorch-based Speech Toolkit
google-deepmind/pysc2
StarCraft II Learning Environment
PaddlePaddle/PaddleGAN
PaddlePaddle GAN library, including lots of interesting applications like First-Order motion transfer, Wav2Lip, picture repair, image editing, photo2cartoon, image style transfer, GPEN, and so on.
XavierXiao/Dreambooth-Stable-Diffusion
Implementation of Dreambooth (https://arxiv.org/abs/2208.12242) with Stable Diffusion
facebookresearch/metaseq
Repo for external large-scale work
pyannote/pyannote-audio
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
ThoughtfulDev/EagleEye
Stalk your Friends. Find their Instagram, FB and Twitter Profiles using Image Recognition and Reverse Image Search.
JoePenna/Dreambooth-Stable-Diffusion
Implementation of Dreambooth (https://arxiv.org/abs/2208.12242) by way of Textual Inversion (https://arxiv.org/abs/2208.01618) for Stable Diffusion (https://arxiv.org/abs/2112.10752). Tweaks focused on training faces, objects, and styles.
yangxy/GPEN
wq2012/awesome-diarization
A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources.
bloc97/CrossAttentionControl
Unofficial implementation of "Prompt-to-Prompt Image Editing with Cross Attention Control" with Stable Diffusion
MycroftAI/mimic3
A fast, local neural text-to-speech engine for Mycroft
harlanhong/CVPR2022-DaGAN
Official code for CVPR2022 paper: Depth-Aware Generative Adversarial Network for Talking Head Video Generation
rhasspy/larynx
End-to-end text-to-speech system using gruut and ONNX
nepx/halfix
x86 PC emulator that runs both natively and in the browser, via WebAssembly
magic-research/magic-avatar
MagicAvatar: Multimodal Avatar Generation and Animation
taylorlu/Speaker-Diarization
Speaker diarization by uis-rnn and speaker embedding by vgg-speaker-recognition
lrusso/VirtualXP
Virtual Machine running on a Web browser
AleRapchan/flash-swap-arbitrage-bot
Smart Contract BOT code, running on Ethereum Blockchain, watching for and executing profitable arbitrage opportunities using flash loans and flash swaps.
watzon/fbmdob
Facebook image Metadata Obfuscation server
Victarry/stable-dreambooth
Dreambooth implementation based on Stable Diffusion with minimal code.
bycloudai/GPEN-colab