MohannadEhabBarakat's Stars
AMAI-GmbH/AI-Expert-Roadmap
Roadmap to becoming an Artificial Intelligence Expert in 2022
w-okada/voice-changer
Realtime Voice Changer
m-bain/whisperX
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
neonbjb/tortoise-tts
A multi-voice TTS system trained with an emphasis on quality
replicate/cog
Containers for machine learning
rhasspy/piper
A fast, local neural text to speech system
ryo-ma/github-profile-trophy
🏆 Add dynamically generated GitHub Stat Trophies on your readme
yl4579/StyleTTS2
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
espeak-ng/espeak-ng
eSpeak NG is an open source speech synthesizer that supports more than a hundred languages and accents.
creativetimofficial/material-tailwind
@material-tailwind is an easy-to-use components library for Tailwind CSS and Material Design.
sfikas/medical-imaging-datasets
A list of medical imaging datasets.
adefossez/demucs
Code for the paper Hybrid Spectrogram and Waveform Source Separation
sh-lee-prml/HierSpeechpp
The official implementation of HierSpeech++
wty-ustc/HairCLIP
[CVPR 2022] HairCLIP: Design Your Hair by Text and Reference Image
laclouis5/globox
A package to read and convert object detection datasets (COCO, YOLO, PascalVOC, LabelMe, CVAT, OpenImage, ...) and evaluate them with COCO and PascalVOC metrics.
WebDevSimplified/Learn-React-In-30-Minutes
VIPL-Audio-Visual-Speech-Understanding/learn-an-effective-lip-reading-model-without-pains
The PyTorch code and model for "Learn an Effective Lip Reading Model without Pains" (https://arxiv.org/abs/2011.07557), which reaches state-of-the-art performance on the LRW-1000 dataset.
manmay-nakhashi/tortoise-tts-fastest
Faster Tortoise inference than Tortoise Fast Fork
NextAudioGen/ultimatevocalremover_api
API for a Vocal Remover that uses Deep Neural Networks.
drengskapur/docker-in-colab
Run Docker inside Google Colab
liangjiubujiu/CTooth
This is the official link to request CTooth.
NeuralVox/OpenPhonemizer
An espeak-compatible, permissively-licensed IPA phonemizer (G2P) based on DeepPhonemizer. Usable as a drop-in replacement for espeak's GPL phonemizer.
MikeOfZen/Yet-Another-Openpose-Implementation
This project reimplements the OpenPose paper (Cao et al., 2018) from scratch, using TensorFlow 2.1 and optional TPU-powered training.
RISE-MICCAI/Journal-Club
The RISE Journal Club aims to create a friendly environment to discuss the latest state-of-the-art papers in the areas of medical image analysis, AI and computer vision. The moderators will briefly introduce the paper and then moderate a discussion where everyone is welcome to provide their thoughts and ask any questions on the paper.
MIC-DKFZ/ACDC2017
CircuitCM/RVC-inference
High performance RVC inferencing, intended for multiple instances in memory at once. Also includes the latest pitch estimator RMVPE, Python 3.8-3.11 compatible, pip installable, memory + performance improvements in the pipeline and model usage.
zsxkib/voice-cloning-training
Voice data <= 10 mins can also be used to train a good VC model!
fakerybakery/txtsplit
A simple text splitter based on Tortoise for use in text-to-speech applications
lucataco/RAVE
RAVE: Randomized Noise Shuffling for Fast and Consistent Video Editing with Diffusion Models - Official Repo
NextAudioGen/StyleTTS2
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models