/ga-project

General Assembly Data Science Project

Data

The publicly available data for this project can be downloaded here: http://www.kaggle.com/c/galaxy-zoo-the-galaxy-challenge/data

However, the team I'm working with has preprocessed these images into a more consistent, easier-to-use format.

Algorithm

I'm working through the tutorials at http://deeplearning.net/tutorial/logreg.html, starting with logistic regression and moving on to the deep learning classification problem. The final goal of this project is to run the deep learning tutorial successfully on the galaxy data set and to understand, at a high level, what the deep neural net is doing. The project therefore mainly revolves around computation on large data sets.
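As a rough illustration of the logistic regression starting point, here is a minimal sketch of multiclass logistic regression trained with minibatch gradient descent on the negative log-likelihood. It is not the tutorial's Theano code: it uses plain NumPy, the data is synthetic (three made-up 2-D clusters, not the galaxy images), and the names train_logreg and predict as well as the learning rate, epoch count and batch size are assumptions chosen for this example.

```python
import numpy as np


def softmax(z):
    # Subtract the row-wise max before exponentiating for numerical stability.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def train_logreg(X, y, n_classes, lr=0.05, n_epochs=50, batch_size=20, seed=0):
    """Fit weights W and bias b by minibatch gradient descent on the mean
    negative log-likelihood. All hyperparameters here are illustrative."""
    rng = np.random.default_rng(seed)
    n_samples, n_features = X.shape
    W = np.zeros((n_features, n_classes))
    b = np.zeros(n_classes)
    for _ in range(n_epochs):
        order = rng.permutation(n_samples)  # reshuffle every epoch
        for start in range(0, n_samples, batch_size):
            idx = order[start:start + batch_size]
            xb, yb = X[idx], y[idx]
            # Gradient of the mean NLL with respect to the logits:
            # (predicted probabilities - one-hot targets) / batch size.
            grad = softmax(xb @ W + b)
            grad[np.arange(len(idx)), yb] -= 1.0
            grad /= len(idx)
            W -= lr * (xb.T @ grad)
            b -= lr * grad.sum(axis=0)
    return W, b


def predict(X, W, b):
    # The predicted class is the one with the largest logit.
    return np.argmax(X @ W + b, axis=1)


# Synthetic stand-in data: three Gaussian clusters in two dimensions.
data_rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [4.0, 4.0], [-4.0, 4.0]])
y = data_rng.integers(0, 3, size=300)
X = centers[y] + data_rng.normal(size=(300, 2))

W, b = train_logreg(X, y, n_classes=3)
accuracy = np.mean(predict(X, W, b) == y)
print("training accuracy:", accuracy)
```

For the real data set, X would hold one flattened, preprocessed image per row, and the same minibatch loop is what makes training feasible when the data does not fit comfortably in a single gradient step.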

I will try to run the full algorithm on the entire data set using a powerful AWS instance, probably one with a GPU to speed up the Theano computations.
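One way to point Theano at a GPU is the THEANO_FLAGS environment variable, which Theano reads when it is first imported. The sketch below sets it from Python; the helper name set_theano_flags is hypothetical, and the values shown (device=gpu, floatX=float32) are an assumption about the setup, since the right device string depends on the Theano version and backend installed on the instance (later releases use device=cuda).

```python
import os


def set_theano_flags(device="gpu", floatX="float32"):
    # Hypothetical helper: builds the flag string and exports it.
    # Assumed values; adjust the device name to the installed backend.
    flags = "device={},floatX={}".format(device, floatX)
    os.environ["THEANO_FLAGS"] = flags
    return flags


flags = set_theano_flags()
print(flags)

# The flags only take effect if they are set before Theano is imported:
# import theano
```

The same flags can also be set in the shell before launching the training script, which avoids any dependence on import order.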