Pinned Repositories
AcademiCodec
AcademiCodec: An Open Source Audio Codec Model for Academic Research
adaptive_voice_conversion
AdaSpeech
AdaSpeech: Adaptive Text to Speech for Custom Voice
Advanced-Deep-Learning-with-Keras
Advanced Deep Learning with Keras, published by Packt
AFILM
MLSP 2021 Self-Attention for Audio Super-resolution - Keras implementation
AGAIN-VC
This is the official implementation of the paper AGAIN-VC: A One-shot Voice Conversion using Activation Guidance and Adaptive Instance Normalization.
agfzb-CloudAppDevelopment_Capstone
ai-research-code
voice conversion (and other stuff)
APNet2
Source code of APNet2, a vocoder
Real-Time-Voice-Cloning
Clone a voice in 5 seconds to generate arbitrary speech in real-time
oytunturk's Repositories
oytunturk/audiocraft
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
oytunturk/BABE2
oytunturk/cuda_practice
My own repository containing the code I wrote to practice CUDA programming.
oytunturk/cutlet
Japanese to romaji converter in Python
oytunturk/DAMO-ConvAI
DAMO-ConvAI: The official repository which contains the codebase for Alibaba DAMO Conversational AI.
oytunturk/demo_vc
oytunturk/descript-audio-codec
State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz mono/stereo audio.
oytunturk/ECSS
Emotion Rendering for Conversational Speech Synthesis with Heterogeneous Graph-Based Context Modeling (Accepted by AAAI2024)
oytunturk/fadtk
A simple library for Fréchet Audio Distance (FAD) calculation
oytunturk/FasterViT
[ICLR 2024] Official PyTorch implementation of FasterViT: Fast Vision Transformers with Hierarchical Attention
oytunturk/frechet-audio-distance
A lightweight library for Frechet Audio Distance calculation.
oytunturk/Generative_Deep_Learning_2nd_Edition
The official code repository for the second edition of the O'Reilly book Generative Deep Learning: Teaching Machines to Paint, Write, Compose and Play.
oytunturk/I2I-Mamba
Official implementation of I2I-Mamba, an image-to-image translation model based on selective state spaces
oytunturk/IMS-Toucan
Multilingual and Controllable Text-to-Speech Toolkit of the Speech and Language Technologies Group at the University of Stuttgart.
oytunturk/keras-io
Keras documentation, hosted live at keras.io
oytunturk/mamba
oytunturk/Mamba-TasNet
oytunturk/naturalspeech3_facodec
FACodec: Speech Codec with Attribute Factorization used for NaturalSpeech 3
oytunturk/open_clip
An open source implementation of CLIP.
oytunturk/pyannote-audio
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
oytunturk/RAD-MMM
A TTS model that makes a speaker speak new languages
oytunturk/Retrieval-based-Voice-Conversion-WebUI
Voice data <= 10 mins can also be used to train a good VC model!
oytunturk/Samba
Official implementation of "Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling"
oytunturk/speech-trident
Awesome speech/audio LLMs, representation learning, and codec models
oytunturk/ssamba
An official implementation for SSAMBA: Self-Supervised Audio Mamba
oytunturk/StableTTS
Next-generation TTS model using flow-matching and DiT, inspired by Stable Diffusion 3
oytunturk/ThunderKittens
Tile primitives for speedy kernels
oytunturk/vampnet
music generation with masked transformers!
oytunturk/vector-quantize-pytorch
Vector (and Scalar) Quantization, in PyTorch
oytunturk/VoiceCraft
Zero-Shot Speech Editing and Text-to-Speech in the Wild