Speech-Recognition-Tensorflow-Challenge

Different CNN Models for keyword spotting in speech recognition

Primary language: Python. License: GNU General Public License v3.0 (GPL-3.0).

### Update 10/07/2018

## MAJOR CHANGES COMING SOON, including a PyTorch implementation and better structure

Tensorflow Speech Recognition Challenge
https://www.kaggle.com/c/tensorflow-speech-recognition-challenge
Folders:
	images: spectrogram images generated from the audio clips
	im_train: the spectrogram images, resized to 28x28
	results: graphs of the results
	papers: some useful papers
	test_pics: spectrograms of the test audio clips (ignore)
	Deprecated: old GCP files (ignore)
Files:
	complete.py -> code with two CNN models and adversarial training
	ReadMe -> this file

	Some files were used for preprocessing older data
	but may be useful for other projects.
	Ignore these:
		CNN_code_for_resized_data.py
		dataset.py
		downsizing.py <- recursively resizes all images in a folder
		ds.py <- an attempt at using an iterator
		pp.py <- audio-to-image conversion: recursively converts all audio clips in a folder to
				 the corresponding spectrograms
		speech_recog.py <- ignore
		GCP-SR.py <- for local usage on Google Cloud Platform
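
The recursive preprocessing helpers listed above (pp.py for audio to spectrogram, downsizing.py for resizing) both walk a folder tree and write one output per input. The sketch below shows one way such a walk could look. It is not the code from pp.py or downsizing.py: the function names, the `.wav`/`.png` extensions, and the `convert_one` callback are assumptions made for illustration, and the actual conversion (spectrogram or resize) is left to whatever callable you pass in.

```python
from pathlib import Path
from typing import Callable, Iterator, Tuple


def iter_conversion_jobs(src_root: Path, dst_root: Path,
                         src_ext: str = ".wav",
                         dst_ext: str = ".png") -> Iterator[Tuple[Path, Path]]:
    """Yield (source, destination) pairs for every matching file under src_root.

    The destination mirrors the source's sub-folder layout under dst_root,
    with the file extension swapped (hypothetical defaults: .wav -> .png).
    """
    for src in sorted(src_root.rglob("*" + src_ext)):
        if src.is_file():
            rel = src.relative_to(src_root)
            yield src, (dst_root / rel).with_suffix(dst_ext)


def convert_tree(src_root: Path, dst_root: Path,
                 convert_one: Callable[[Path, Path], None],
                 src_ext: str = ".wav", dst_ext: str = ".png") -> int:
    """Call convert_one(src, dst) for every matching file; return the count."""
    count = 0
    for src, dst in iter_conversion_jobs(src_root, dst_root, src_ext, dst_ext):
        # Create the mirrored sub-folder before writing the output file.
        dst.parent.mkdir(parents=True, exist_ok=True)
        convert_one(src, dst)
        count += 1
    return count
```

For a resize pass, the same walk could be reused with `src_ext=".png"` and a `convert_one` that loads, resizes to 28x28, and saves each image.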
		

Models:
Shallow CNN: a CNN similar to AlexNet, with two fully connected (fc) layers at the end. Dropout can be enabled or disabled.

Deeper CNN:
	wide  : adds more layers to the CNN and removes dropout
	wider : increases the number of filters
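
To get a feel for how a 28x28 input shrinks through a small conv/pool stack before reaching the fc layers, the sketch below traces the spatial sizes with the standard output-size formula `(n + 2p - k) // s + 1`. The layer stack, filter counts, and single input channel are hypothetical and are not taken from complete.py; only the 28x28 input size comes from this ReadMe.

```python
from typing import List, Tuple

# One layer: (kind, kernel, stride, padding, out_channels)
Layer = Tuple[str, int, int, int, int]


def conv_out(size: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1


def trace_shapes(size: int, channels: int,
                 layers: List[Layer]) -> List[Tuple[int, int, int]]:
    """Return the (height, width, channels) shape after each layer."""
    shapes = [(size, size, channels)]
    for kind, kernel, stride, padding, out_channels in layers:
        size = conv_out(size, kernel, stride, padding)
        if kind == "conv":
            # Only convolutions change the channel count; pooling keeps it.
            channels = out_channels
        shapes.append((size, size, channels))
    return shapes


# Hypothetical shallow stack (assumed for illustration only).
SHALLOW: List[Layer] = [
    ("conv", 5, 1, 2, 32),  # 28 -> 28
    ("pool", 2, 2, 0, 0),   # 28 -> 14
    ("conv", 5, 1, 2, 64),  # 14 -> 14
    ("pool", 2, 2, 0, 0),   # 14 -> 7
]

shapes = trace_shapes(28, 1, SHALLOW)
height, width, depth = shapes[-1]
flat_features = height * width * depth  # input size of the first fc layer
```

Under these assumptions, making the network "wider" only changes the `out_channels` values (and so `flat_features`), while adding layers changes how far the spatial size shrinks.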


For Results and Talks:
	ML_final.pdf
	ML_talk.pdf