/voice-disciminator

A neural network, written in TensorFlow, for filtering a target speaker's voice out of audio

Primary language: Python. License: MIT.

Voice Discriminator

Work In Progress

Overview

  • anomaly detection

Environment

  • python 3.5
  • tensorflow-gpu 1.8.0
  • tensorpack 0.8.5

Dataset

  • A set of raw waveforms from the target speaker, labeled 1
  • A large set of raw waveforms from non-target speakers, labeled 0
  • Preprocessing required
    • consistent sample rate across all files
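
The dataset layout above can be sketched as follows. This is a minimal illustration, not the repository's actual preprocessing code: the helper names (read_wave, resample_linear, label_examples) and the 16 kHz target rate are assumptions made for the example, and only the Python standard library is used. Linear interpolation is chosen for brevity; when downsampling real audio, a resampler with a proper low-pass filter is usually a better choice.

```python
import struct
import wave

# Assumption: the README does not state a sample rate; 16 kHz is a placeholder.
TARGET_RATE = 16000


def read_wave(path):
    """Read a 16-bit mono PCM WAV file and return (samples, sample_rate)."""
    with wave.open(path, "rb") as f:
        if f.getnchannels() != 1 or f.getsampwidth() != 2:
            raise ValueError("expected 16-bit mono PCM: " + path)
        rate = f.getframerate()
        raw = f.readframes(f.getnframes())
    # Two bytes per sample, little-endian signed integers.
    samples = list(struct.unpack("<%dh" % (len(raw) // 2), raw))
    return samples, rate


def resample_linear(samples, src_rate, dst_rate):
    """Resample by linear interpolation (no anti-aliasing filter)."""
    if src_rate == dst_rate or not samples:
        return list(samples)
    n_out = int(len(samples) * dst_rate / src_rate)
    out = []
    for i in range(n_out):
        pos = i * src_rate / dst_rate        # position in the source signal
        lo = int(pos)
        hi = min(lo + 1, len(samples) - 1)   # clamp at the last sample
        frac = pos - lo
        out.append((1.0 - frac) * samples[lo] + frac * samples[hi])
    return out


def label_examples(target_paths, non_target_paths):
    """Pair each file with its label: 1 for the target speaker, 0 otherwise."""
    return [(p, 1) for p in target_paths] + [(p, 0) for p in non_target_paths]
```

A typical use would be to call label_examples on the two file lists, then read each file with read_wave and pass the samples through resample_linear(samples, rate, TARGET_RATE) so that every example reaches the network at the same sample rate.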

TODOs

  • python 2 compatibility
  • hparam xxx/yyy

Related Papers

  • Lee, K., Lee, H., Lee, K., & Shin, J. (2017, November 26). Training Confidence-calibrated Classifiers for Detecting Out-of-Distribution Samples. CoRR.
  • Liang, S., Li, Y., & Srikant, R. (n.d.). Enhancing the Reliability of Out-of-Distribution Image Detection in Neural Networks. Pdfs.Semanticscholar.org.
  • Schlegl, T., Seeböck, P., Waldstein, S. M., Schmidt-Erfurth, U., & Langs, G. (2017, March 17). Unsupervised Anomaly Detection with Generative Adversarial Networks to Guide Marker Discovery. arXiv.org.