Audioperm

A Python library for generating different permutations of audible segments from audio files.

License: MIT

Use cases:

  • Silence Removal from Audio
  • Audio / Speech augmentation
  • Word segmentation
  • Word level permutation generation
  • Add new synthetic data for deep learning
  • Speaker recognition, Speaker verification, Audio classification, Audio fingerprinting

Documentation: https://zabir-nabil.github.io/audioperm/

Source Code: https://github.com/zabir-nabil/audioperm


Word segmentation

from audioperm import AudioPerm
from audioperm.utils import save_audio

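ap = AudioPerm("i_love_cats.m4a")
label = "i love cats"

# Split the recording into audible segments, ideally one per spoken word
words = ap.word_segments()
label_words = label.split()

# Save each segment under its word label.
# zip stops at the shorter list, so a mismatch between the number of
# detected segments and the number of label words cannot raise IndexError.
for w, name in zip(words, label_words):
  save_audio(w, name + ".wav")

Files in the working directory afterwards:

cats.wav  i_love_cats.m4a  i.wav  love.wav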
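
To give a feel for how silence-based word segmentation can work, here is a minimal sketch using only NumPy. It is not audioperm's actual implementation: the function name `split_on_silence`, the RMS-energy criterion, and every parameter (`frame_len`, `threshold`, `min_silence_frames`) are assumptions made for illustration, and the input is a synthetic signal rather than a real recording.

```python
import numpy as np

def split_on_silence(signal, frame_len=400, threshold=0.02, min_silence_frames=5):
    """Hypothetical helper: return a list of 1-D arrays, one per audible segment."""
    signal = np.asarray(signal, dtype=np.float64)
    n_frames = len(signal) // frame_len
    if n_frames == 0:
        return []
    frames = signal[: n_frames * frame_len].reshape(n_frames, frame_len)
    # Root-mean-square energy per frame; frames above the threshold count as audible
    rms = np.sqrt(np.mean(frames ** 2, axis=1))
    audible = rms > threshold

    segments = []
    start = None      # frame index where the current segment began
    silence_run = 0   # consecutive silent frames seen inside a segment
    for i, is_audible in enumerate(audible):
        if is_audible:
            if start is None:
                start = i
            silence_run = 0
        elif start is not None:
            silence_run += 1
            # A long enough pause closes the current segment
            if silence_run >= min_silence_frames:
                end = i - silence_run + 1  # first silent frame of the pause
                segments.append(signal[start * frame_len : end * frame_len])
                start = None
                silence_run = 0
    if start is not None:
        # The signal ended inside a segment; drop any short trailing silence
        end = n_frames - silence_run
        segments.append(signal[start * frame_len : end * frame_len])
    return segments

# Synthetic test signal: three 0.25 s tones separated by 0.25 s of silence
sr = 16000
t = np.arange(sr // 4) / sr
tone = 0.5 * np.sin(2 * np.pi * 440 * t)
gap = np.zeros(sr // 4)
demo = np.concatenate([tone, gap, tone, gap, tone])

segments = split_on_silence(demo)
print(len(segments))  # 3
```

Real speech needs more care than this sketch (for example, a threshold tuned to the recording's noise floor), which is the kind of detail a library such as audioperm handles for you.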

Word-level permutation

import numpy as np
from audioperm import AudioPerm
from audioperm.utils import save_audio

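ap = AudioPerm("i_love_cats.m4a")
# Segment the audio first; return_words=False means the segments are not returned here
ap.word_segments(return_words=False)
perm_sentences = ap.permute(n_permutations=5)

for i, s in enumerate(perm_sentences):
  # Join the word segments of one permutation into a single waveform and
  # cast to 16-bit integers before saving (temporary workaround, will fix later)
  s = np.hstack(s).astype(np.int16)
  save_audio(s, f"perm_{i}.wav")

Files in the working directory afterwards:

cats.wav  i.wav  i_love_cats.m4a  love.wav  perm_0.wav  perm_1.wav  perm_2.wav  perm_3.wav  perm_4.wav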
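
The idea behind word-level permutation can be sketched in a few lines of standard-library Python plus NumPy. This is an illustration under stated assumptions, not the code behind `ap.permute`: the helper `sample_permutations` and its `seed` parameter are hypothetical, and the "words" are tiny placeholder arrays instead of real audio segments.

```python
import itertools
import random

import numpy as np

def sample_permutations(segments, n_permutations, seed=None):
    """Hypothetical helper: return up to n_permutations distinct orderings
    of the segments, each joined into one 1-D array."""
    rng = random.Random(seed)
    # Enumerate every ordering of the segment indices (fine for short sentences,
    # since the count grows factorially with the number of words)
    all_orders = list(itertools.permutations(range(len(segments))))
    # Sample without replacement so no ordering is repeated
    chosen = rng.sample(all_orders, min(n_permutations, len(all_orders)))
    return [np.hstack([segments[i] for i in order]) for order in chosen]

# Placeholder "word" segments of different lengths and values
toy_words = [
    np.full(2, 1, dtype=np.int16),
    np.full(3, 2, dtype=np.int16),
    np.full(4, 3, dtype=np.int16),
]

perms = sample_permutations(toy_words, n_permutations=5, seed=0)
for p in perms:
    print(p)
```

Three words allow 3! = 6 orderings, so asking for 5 returns five distinct sequences; asking for more than 6 would simply return all 6.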