pylearn2 tutorial example: "grbm_smd" by Ian Goodfellow

This directory contains an example of how to train a model using pylearn2.
Specifically, we'll train an RBM with Gaussian units ("grbm") using an
objective function called denoising score matching ("smd").

If you are not working from the LISA lab, you'll need to set up the dataset
yourself. Be sure to follow the "Download and Installation" instructions at
http://deeplearning.net/software/pylearn2/ to set up your PYLEARN2_DATA_PATH
directory.

Step 0: Download the CIFAR-10 dataset
-------------------------------------

People working from the LISA lab can skip this step.

To download the CIFAR-10 dataset and put it in the right place, you can run
pylearn2/scripts/datasets/download_cifar10.sh from this directory. It will
get the Python version of the dataset from
http://www.cs.utoronto.ca/~kriz/cifar.html and extract it into
${PYLEARN2_DATA_PATH}/cifar10/cifar-10-batches-py

Once you have the data, there are three steps to do in pylearn2. First, we
create a preprocessed dataset and save it to the filesystem. Then we train a
model on that dataset using the train.py script. Finally, we inspect the
model to see how the learning experiment went.

Step 1: Create the dataset
--------------------------

From this directory, run

    python make_dataset.py

This will take a while. You should read through make_dataset.py to understand
what it is doing. The file is heavily commented and is basically a small
tutorial on pylearn2 datasets.

As a brief summary, this script will load the CIFAR-10 dataset of 32x32 color
images. It will extract 8x8 pixel patches from them, run a regularized
version of contrast normalization and whitening on them, then save the
preprocessing parameters to cifar10_preprocessed_train.pkl and the
preprocessed dataset to train_design.npy.

Step 2: Train the model
-----------------------

You should have pylearn2/scripts in your PATH environment variable.
pylearn2/scripts/train.py should have executable permissions.
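
Before launching training, it can help to confirm the setup described so far.
The following is a minimal sketch and not part of pylearn2: the helper names
find_executable and check_prerequisites are hypothetical, and it assumes the
dataset location given in Step 0 and that train.py is looked up through PATH.

```python
import os


def find_executable(name, path_value):
    """Return the first executable file called `name` found in a
    PATH-style string, or None if there is no such file."""
    for directory in path_value.split(os.pathsep):
        if not directory:
            continue
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


def check_prerequisites(environ=os.environ):
    """Return a list of human-readable problems.
    An empty list means every check passed."""
    problems = []

    # Step 0: the dataset should live under PYLEARN2_DATA_PATH.
    data_path = environ.get("PYLEARN2_DATA_PATH")
    if not data_path:
        problems.append("PYLEARN2_DATA_PATH is not set")
    else:
        cifar_dir = os.path.join(data_path, "cifar10", "cifar-10-batches-py")
        if not os.path.isdir(cifar_dir):
            problems.append("CIFAR-10 not found at " + cifar_dir)

    # Step 2: train.py should be on PATH and executable.
    if find_executable("train.py", environ.get("PATH", "")) is None:
        problems.append("train.py is not an executable file on PATH")

    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("PROBLEM: " + problem)
```

If the script prints nothing, the checks it covers have passed. It does not
verify that the dataset files themselves are complete.
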

From this directory, run

    train.py cifar_grbm_smd.yaml

This will also take a while. You should read the YAML file for more
information. It is heavily commented and is basically a tutorial on YAML
files and training models.

As a high-level summary, it will create a file called cifar_grbm_smd.pkl.
This file will contain a Gaussian-binary RBM trained using denoising score
matching.

Step 3: Inspect the model
-------------------------

pylearn2/scripts/show_weights.py and pylearn2/scripts/plot_monitor.py should
have executable permissions.

From this directory, run

    show_weights.py cifar_grbm_smd.pkl

A window containing the filters learned by the RBM will appear. They should
mostly be colorless Gabor filters, though some will be color patches.

Note that the filters are still a bit grainy. This is because the example
script doesn't train for very long. You can modify cifar_grbm_smd.yaml to
train for longer if you would like to see prettier filters.

Now close that window. Run

    plot_monitor.py cifar_grbm_smd.pkl

This will display a plot of the objective function over time, so you can see
what happened during training.
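
The commands from Steps 1 to 3 can also be collected into a small driver
script. This is only a sketch under stated assumptions: STEPS and run_steps
are hypothetical names, the script is assumed to be started from this
directory, and pylearn2/scripts is assumed to be on PATH. By default it only
prints the commands; nothing is executed unless dry_run is set to False.

```python
import subprocess

# Commands taken from Steps 1 to 3, in the order they should be run.
STEPS = [
    ("Step 1: create the dataset", ["python", "make_dataset.py"]),
    ("Step 2: train the model", ["train.py", "cifar_grbm_smd.yaml"]),
    ("Step 3: show the filters", ["show_weights.py", "cifar_grbm_smd.pkl"]),
    ("Step 3: plot the objective", ["plot_monitor.py", "cifar_grbm_smd.pkl"]),
]


def run_steps(steps=STEPS, dry_run=True):
    """Print each command and, unless dry_run is True, execute it.

    Execution stops at the first command that exits with an error,
    because subprocess.check_call raises CalledProcessError.
    Returns the list of commands that were printed (and possibly run).
    """
    handled = []
    for label, argv in steps:
        print("%s: %s" % (label, " ".join(argv)))
        if not dry_run:
            subprocess.check_call(argv)
        handled.append(argv)
    return handled


if __name__ == "__main__":
    # Dry run only: list the commands without executing them.
    run_steps(dry_run=True)
```

Calling run_steps(dry_run=False) would run the whole tutorial in one go. The
two Step 3 commands open windows, so each is expected to block until its
window is closed.
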