In this project we use deep learning to classify Parkinson's disease into four classes. To build an accurate model, we synthesize speech with WaveGAN [paper] and augment speech with a newly developed GAN [github repo].
Parkinson’s disease (PD) affects more than 10 million people worldwide. However, conventional diagnosis of PD, which relies on a physician's judgment, is time consuming and often inaccurate. We therefore aim to develop an accurate deep learning model that diagnoses PD from the vocal symptoms of hypokinetic dysarthria unique to the disease. Because deep learning does not require the complex feature extraction and selection steps of classical machine learning, we were able to improve both the efficiency and the accuracy of the classification model. Furthermore, factors such as voice volume, recording quality, and the patient's ethnicity may confound the diagnosis. We propose to use a Generative Adversarial Network (GAN) to map these parameters onto a standard threshold. The revised inputs, when fed to a network trained specifically on that threshold, yield higher accuracy.
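As a rough illustration of the kind of confound being standardized (not the GAN itself, which is described in the repo linked above), the sketch below scales every recording to a common RMS loudness before classification. The `librosa` loader and the target level are assumptions made only for this example.

```python
# Illustration only: naive loudness normalization toward a fixed reference.
# The GAN described above learns a richer mapping; this just shows the kind
# of recording-level confound (e.g. voice volume) being removed.
import numpy as np
import librosa

TARGET_RMS = 0.1  # hypothetical reference loudness


def normalize_rms(path, sr=16000):
    """Load a recording and scale it to a fixed RMS level."""
    audio, _ = librosa.load(path, sr=sr, mono=True)
    rms = np.sqrt(np.mean(audio ** 2)) + 1e-8
    return audio * (TARGET_RMS / rms)
```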
- Software
  - Tensorflow >= 1.4
  - Python 3.6
- Hardware
- Noisy DEMAND 2
- Speech Commands Zero through Nine (SC09)
- Tensorflow Challenge Speech Commands data (full)
- Parkinson's Disease Extended Vowel Sound dataset (not public)
We use a four-class classifier based on a CNN. [baseline] [paper]
python3 classifier/speech_classifier/speech.py
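The exact architecture is defined in the baseline linked above; purely as a sketch, a four-class CNN over log-mel spectrogram inputs could look like the following. Layer sizes and the input shape are illustrative assumptions, not the repo's configuration.

```python
# Sketch of a four-class CNN speech classifier on log-mel spectrograms.
# Layer sizes are illustrative; the actual model is defined in
# classifier/speech_classifier/speech.py (see the baseline link above).
import tensorflow as tf

NUM_CLASSES = 4  # four classes as described above


def build_model(input_shape=(128, 128, 1)):
    """Small CNN over spectrogram patches (illustrative sizes)."""
    return tf.keras.Sequential([
        tf.keras.layers.Conv2D(32, 3, activation='relu', input_shape=input_shape),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Conv2D(64, 3, activation='relu'),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(128, activation='relu'),
        tf.keras.layers.Dense(NUM_CLASSES, activation='softmax'),
    ])


model = build_model()
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
```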
The data must be organized as follows:
data/train >
    healthy >
        audio_0.wav
        audio_1.wav
    pd_1 >
        audio_2.wav
        audio_3.wav
    pd_2 >
        audio_4.wav
        audio_5.wav
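For reference, one way this folder layout maps to (file, label) pairs is sketched below; the loader here is an assumption made for illustration, and the actual loading code lives in classifier/speech_classifier/speech.py.

```python
# Illustration of how the directory layout above maps to (wav_path, label)
# pairs: each class folder (healthy, pd_1, ...) becomes one integer label.
import glob
import os


def index_dataset(root='data/train'):
    """Return (wav_path, label) pairs and the folder-name -> label mapping."""
    classes = sorted(d for d in os.listdir(root)
                     if os.path.isdir(os.path.join(root, d)))
    label_map = {name: idx for idx, name in enumerate(classes)}
    pairs = [(wav, label_map[name])
             for name in classes
             for wav in sorted(glob.glob(os.path.join(root, name, '*.wav')))]
    return pairs, label_map

# e.g. pairs -> [('data/train/healthy/audio_0.wav', 0), ...]
```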
We use WaveGAN and a newly proposed GAN model to synthesize speech from the training data.
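If you train WaveGAN with the linked repository, generating new audio from a saved inference graph looks roughly like the sketch below (TF 1.x). The checkpoint paths are placeholders, and the tensor names `z:0` and `G_z:0` follow the generation example in the WaveGAN repository; adjust them to your own export.

```python
# Sketch: sample audio from a trained WaveGAN checkpoint (TF 1.x).
# Paths are placeholders; tensor names follow WaveGAN's exported inference graph.
import numpy as np
import tensorflow as tf

META_PATH = 'infer/infer.meta'   # placeholder: exported inference metagraph
CKPT_PATH = 'model.ckpt'         # placeholder: trained generator weights

tf.reset_default_graph()
saver = tf.train.import_meta_graph(META_PATH)
graph = tf.get_default_graph()

with tf.Session() as sess:
    saver.restore(sess, CKPT_PATH)
    # 16 latent vectors drawn uniformly from [-1, 1), 100 dimensions each
    z = (np.random.rand(16, 100) * 2.0) - 1.0
    generated = sess.run(graph.get_tensor_by_name('G_z:0'),
                         {graph.get_tensor_by_name('z:0'): z})
    # 'generated' holds one synthesized waveform per latent vector
```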
- Little MA, et al. "Suitability of dysphonia measurements for telemonitoring of Parkinson’s disease." IEEE Transactions on Bio-medical Engineering. 2009. [paper]
- Tsanas, A., et al. “Novel Speech Signal Processing Algorithms for High-Accuracy Classification of Parkinson's Disease.” IEEE Transactions on Biomedical Engineering. 2012. [paper]
- Donahue, Chris, Julian McAuley, and Miller Puckette. "Synthesizing Audio with Generative Adversarial Networks." arXiv preprint arXiv:1802.04208 (2018). [paper]
- Shen, Jonathan, et al. "Natural TTS synthesis by conditioning wavenet on mel spectrogram predictions." arXiv preprint arXiv:1712.05884 (2017). [paper]
- Perez, Anthony, Chris Proctor, and Archa Jain. Style transfer for prosodic speech. Tech. Rep., Stanford University, 2017. [paper]
- Goodfellow, Ian, et al. "Generative adversarial nets." Advances in neural information processing systems. 2014. [paper]
- Salimans, Tim, et al. "Improved techniques for training gans." Advances in Neural Information Processing Systems. 2016. [paper]
- Grinstein, Eric, et al. "Audio style transfer." arXiv preprint arXiv:1710.11385 (2017). [paper]
- Pascual, Santiago, Antonio Bonafonte, and Joan Serra. "SEGAN: Speech enhancement generative adversarial network." arXiv preprint arXiv:1703.09452 (2017). [paper]
- Jing, Yongcheng, et al. "Neural Style Transfer: A Review." arXiv preprint arXiv:1705.04058 (2017). [paper]
- Van Den Oord, Aäron, et al. "Wavenet: A generative model for raw audio." CoRR abs/1609.03499 (2016). [paper]
- Kingma, Diederik P., and Prafulla Dhariwal. "Glow: Generative flow with invertible 1x1 convolutions." arXiv preprint arXiv:1807.03039 (2018). [paper]
- Kingma, Diederik P., et al. "Semi-supervised learning with deep generative models." Advances in Neural Information Processing Systems. 2014. [paper]
- Chae Young Lee - Hankuk Academy of Foreign Studies - [github]
- Anoop Toffy - IIIT Bangalore - [Website]
- Seo Yeon Yang - Seoul National University - [Website]
This project is licensed under the MIT License - see the LICENSE.md file for details
- Prof. Giovanni Dimauro, Università degli Studi di Bari 'Aldo Moro'
- Dr. Woo-Jin Han, Netmarble IGS
- Dr. Gue Jun Jung, SK Telecom
- Tensorflow Korea
- Google Korea