audio-processing

There are 3,270 repositories under the audio-processing topic.

  • SincNet

    SincNet is a neural architecture for efficiently processing raw audio samples.

    Language: Python · 1.2k
  • StreamSpeech

    StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.

    Language: Python · 1.1k
  • awesome-audio-dsp

    My curated list of audio DSP and plugin development resources

  • audino

    Open source audio annotation tool for humans

    Language: JavaScript · 1.1k
  • chromaprint

    C library for generating audio fingerprints used by AcoustID

    Language: C++ · 1.1k
  • nnAudio

    Audio processing using PyTorch 1D convolutional networks

    Language: Python · 1.1k
  • DawDreamer

    Digital Audio Workstation with Python; VST instruments/effects, parameter automation, FAUST, JAX, Warp Markers, and JUCE processors

    Language: C++ · 1.1k
  • soundfingerprinting

    Open source audio fingerprinting in .NET. An efficient algorithm for acoustic fingerprinting written purely in C#.

    Language: C# · 1k
  • Wave-U-Net

    Implementation of the Wave-U-Net for audio source separation

    Language: Python · 909
  • SLAM-LLM

    Speech, Language, Audio, Music Processing with Large Language Model

    Language: Python · 890
  • audio-visualizer-android

    🎵 [Android Library] A light-weight and easy-to-use Audio Visualizer for Android.

    Language: Java · 883
  • klio

    Smarter data pipelines for audio.

    Language: Python · 856
  • Beethoven

    🎸 A maestro of pitch detection.

    Language: Swift · 850
  • XR3Player

    🎧 🎼 The MOST ADVANCED JavaFX Media Player

    Language: Java · 752
  • APT

    AI Productivity Tool: free and open source, it improves user productivity and protects privacy and data security. Features include, but are not limited to, built-in local exclusive ChatGPT, DeepSeek, Phi, Qwen and other models, and one-click intelligent batch processing of pictures, videos, audio, etc.

    Language: C# · 735
  • Awesome-Audio-LLM

    Audio Large Language Models

    Language: Python · 719
  • awesome-large-audio-models

    Collection of resources on the applications of Large Language Models (LLMs) in Audio AI.

  • DTLN

    TensorFlow 2.x implementation of the DTLN real-time speech denoising model, with TF-Lite, ONNX, and real-time audio processing support.

    Language: Python · 645
  • r8brain-free-src

    High-quality pro audio resampler / sample rate conversion C++ library. Very fast, for both audio resampling and time-series interpolation.

    Language: C++ · 637
  • fast-music-remover

    A C++ based, lightweight music and noise remover for YouTube and other internet media, using DeepFilterNet for audio enhancement.

    Language: C++ · 627
  • FoleyCrafter

    FoleyCrafter: Bring Silent Videos to Life with Lifelike and Synchronized Sounds. An AI Foley master that adds vivid, synchronized sound effects to your silent videos 😝

    Language: Python · 626
  • PESQ

    PESQ (Perceptual Evaluation of Speech Quality) Wrapper for Python Users (narrow band and wide band)

    Language: C · 604
  • unsilence

    Console Interface and Library to remove silent parts of a media file 🔈

    Language: Python · 584
  • vectorhub

    Vector Hub: a library for easy discovery and consumption of state-of-the-art models that turn data into vectors (text2vec, image2vec, video2vec, graph2vec, bert, inception, etc.)

  • nara_wpe

    Different implementations of "Weighted Prediction Error" for speech dereverberation

    Language: Python · 532
  • Dplug

    Make VST2 / VST3 / AU / AAX / CLAP / LV2 / FLP plug-ins for Linux/macOS/Windows, using D.

    Language: D · 524
  • MediaEditor

    Non-linear editing software that helps you make nice videos.

    Language: C++ · 477
  • SamplerBox

    SamplerBox is a sampler musical instrument based on the Raspberry Pi.

    Language: Python · 461
  • ltu

    Code, Dataset, and Pretrained Models for Audio and Speech Large Language Model "Listen, Think, and Understand".

    Language: Python · 453
  • musig

    A Shazam-like tool to store song fingerprints and retrieve them

    Language: Go · 442
  • surfboard

    Novoic's audio feature extraction library

    Language: Python · 437
  • jumpcutter

    ⏩ Fast-forwards long pauses between sentences — watch lectures ~1.5x faster (browser extension)

    Language: TypeScript · 425
  • emotion-classification-from-audio-files

    Understanding emotions from audio files using neural networks and multiple datasets.

    Language: Python · 419
  • android-vad

    Android Voice Activity Detection (VAD) library. Supports WebRTC VAD (GMM), Silero VAD (DNN), and YAMNet VAD (DNN) models.

    Language: C · 408
  • scaper

    A library for soundscape synthesis and augmentation

    Language: Python · 407
  • whisper-at

    Code and Pretrained Models for Interspeech 2023 Paper "Whisper-AT: Noise-Robust Automatic Speech Recognizers are Also Strong Audio Event Taggers"

    Language: Python · 404