torchaudio

There are 76 repositories under the torchaudio topic.

2noise/ChatTTS
A generative speech model for daily dialogue.
Language: Python 37.8k 211 6564.1k
DrewThomasson/VoxNovel
VoxNovel: generate audiobooks giving each character a different voice actor.
Language: Python 305 9 3032
ujiaqi/MusicRecommend
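:star: Undergraduate graduation project: design and development of a content-based music recommendation system. The model training code is built with the PyTorch framework, and the front end and back end are built with Django.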
Language: JavaScript 238 1 320
KentoNishi/torch-pitch-shift
Pitch-shift audio clips quickly with PyTorch (CUDA supported)! Additional utilities for searching efficient transformations are included.
Language: Python 137 1 612
nipponjo/tts-arabic-pytorch
TTS models for Arabic (Tacotron2, FastPitch)
Language: Jupyter Notebook 119 4 2031
xucailiang/cascade
Cascade is a production-ready, high-performance, and low-latency audio stream processing library designed for Voice Activity Detection (VAD). Built upon the excellent Silero VAD model, Cascade significantly reduces VAD processing latency while maintaining high accuracy through its 1:1:1 binding architecture and asynchronous streaming technology.
Language: Python 81
evshiron/rocm_lab
DEPRECATED!
Language: Shell 52 12 155
KentoNishi/torch-time-stretch
Time-stretch audio clips quickly with PyTorch (CUDA supported)! Additional utilities for searching efficient transformations are included.
Language: Python 40 1 33
SekiroRong/KAN-AutoEncoder
KAE : KAN-based AutoEncoder (AE, VAE, VQ-VAE, RVQ, etc.)
Language: Jupyter Notebook 36 2 01
torchsmoke/Python3-Wheels
Wheels for Python 3
27 6 63
PINTO0309/pytorch4raspberrypi
Cross-compilation of PyTorch armv7l (32bit) for RaspberryPi OS
Language: Dockerfile 20 3 02
overcrash66/OpenTranslator
Open Translator: Speech To Speech and Speech to text Translator with voice cloning and other cool features
Language: Python 12 1 03
BakingBrains/Sound_Classification
Sound classification on Urban Sound Dataset
Language: Jupyter Notebook 10 1 04
aminul-huq/Speech-Command-Classification
Speech command classification on the Speech-Command v0.02 dataset using PyTorch and torchaudio. In this example, three models have been trained using the raw signal waveforms, MFCC features, and MelSpectrogram features.
Language: Python 9 1 15
eonu/torch-fsdd
A utility for wrapping the Free Spoken Digit Dataset into PyTorch-ready data set splits.
Language: Python 9 1 23
LukeSutor/programmatic-pitch
High fidelity music synthesis using diffusion and UnivNet.
Language: Python 9 2 02
CrispenGari/animal-sound-classification
This is a simple artificial neural network model that uses deep learning and torch-audio to classify cat and dog sounds.
Language: Jupyter Notebook 8 2 02
nipponjo/tts-german-pytorch
TTS (FastPitch) for German (Thorsten voice / emotional)
Language: Python 8 2 20
CrispenGari/emotionAI
(😞 😨 😄 😮 😍 😠 😐 🤮) This is a simple DL API that classifies human emotions from audio and text.
Language: Jupyter Notebook 7 2 01
glefundes/misophonia-bot
🤖 Telegram bot powered by Deep Learning. Automatically assesses the safety of audio clips and voice messages for people suffering from misophonia.
Language: Python 6 2 10
igorshmukler/kokoro-ruslan
Kokoro Language Model Training Script for Russian (Ruslan Corpus)
Language: Python 6
pradeepbatchu/speechtotext
Speech to Text with Wav2Vec2 using torchaudio
Language: Python 6 2 01
BaoNguyen6742/uv-install-torch
Tutorial to install torch/pytorch with cuda using uv
Language: PowerShell 5 2 0
LukeSutor/guitar_source_separation
The unmix model trained to separate guitar playing from audio samples using a custom-built dataset.
Language: Python 5 1 01
LumenPallidium/audio_generation
Experiments in neural networks for audio generation.
Language: Python 5 2 00
mehdihosseinimoghadam/Signal-Processing
Signal Processing with Python and Librosa
Language: Jupyter Notebook 5 2 02
vectominist/Switchboard-WSJ-Utils
Utilities for preprocessing the Switchboard and WSJ corpora in Python3
Language: Python 5 1 01
JoelDeonDsouza/Auto_CNN
This repo implements a deep learning pipeline for classifying environmental sounds from the ESC-50 dataset.
Language: TypeScript 4
avrtt/MoE-speech-recognition
Mixture of experts architecture for speech-to-text and language identification, built in PyTorch
Language: Python 3
CrispenGari/torch-audio
🎶🎼 This repository contains some notebooks that were used to train audio classification models in PyTorch using torchaudio.
Language: Jupyter Notebook 3 2 0
dhpollack/spokenlanguages
Language: Jupyter Notebook 3 3 05
Efenstor/PyTorch-ROCm-gfx1010
Instructions on how to build PyTorch on Debian 12 with support for the AMD gfx1010 architecture
Language: Shell 3
manhph2211/DSP101
Building a speaker identification & verification pipeline for Vietnamese voices :sleepy:
Language: Jupyter Notebook 3 1 1
NevroHelios/CrossEmotion
MELD-IFEED-Benchmark
Language: Python 3
thekartikeyamishra/VoiceCloner
The Voice Cloner is a Python-based project that leverages Tacotron 2 and WaveGlow models for text-to-speech (TTS) synthesis and basic voice cloning. This project supports 22 official Indian languages, including Sanskrit, making it versatile for multilingual text input.
Language: Python 3 1 01
yangarbiter/torchaudio-benchmark
TorchAudio: Building Blocks for Audio and Speech Processing
Language: Jupyter Notebook 3 4 0
