Visualize High-Dimensional Data Fast | Watson Studio: Blog Here!

Step-by-step instructions

  1. Download the MNIST handwritten digits sample data set (about 1,000 images per digit) from here. The file is named mnist_all_sample_10000.csv. To get results faster, you can instead use a smaller sample of about 150 images per digit, available here.

  2. Create an account on Watson Studio cloud or download the desktop version here.

  3. Open Watson Studio.

  4. Click New project at the top right to create a new project.

  5. Name your project and click Create at the bottom right.

  6. Click the Assets tab if you are not already there.

  7. Upload mnist_all_sample_10000.csv: on the right-hand side of the screen, drop the file into the upload area or browse for it.

  8. In your project, under Data assets, click the data set to see a preview of it.

  9. Click the blue Refine box at the top right to open the data set in the Data Refinery tool. This step might take a little while because the 10,000-by-785 data set has to load into Data Refinery.

  10. Once the Data Refinery tool is open, navigate to the Visualizations tab.

  11. Create the t-SNE visualization:

    1. Select the t-SNE chart under CHART TYPES.
    2. Set the Perplexity parameter to 75.
    3. Select the column "label" as the Color map.
    4. The chart shows the t-SNE visualization after 1000 iterations. Each colored cloud represents a different digit from zero to nine. For instance, the purple cloud represents the images of the digit one.
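
Before uploading the file, it can help to confirm that the download is intact. The sketch below is not part of Watson Studio; it is a small standard-library check that counts rows, columns, and images per digit. It assumes the CSV has a header row containing a column named "label" (the column used for the color map above); the function name `summarize_digits_csv` is made up for this example.

```python
import csv
from collections import Counter


def summarize_digits_csv(path, label_column="label"):
    """Return (row_count, column_count, images_per_label) for a digits CSV.

    Assumption: the first row is a header that includes `label_column`.
    """
    with open(path, newline="") as handle:
        reader = csv.reader(handle)
        header = next(reader)
        label_index = header.index(label_column)

        per_label = Counter()
        row_count = 0
        for row in reader:
            if not row:
                continue  # skip blank lines, e.g. a trailing newline
            if len(row) != len(header):
                raise ValueError(
                    f"data row {row_count + 1} has {len(row)} fields, "
                    f"expected {len(header)}"
                )
            per_label[row[label_index]] += 1
            row_count += 1

    return row_count, len(header), per_label


# Example usage (file name taken from step 1):
# rows, cols, counts = summarize_digits_csv("mnist_all_sample_10000.csv")
# print(rows, cols)              # the article describes a 10,000 by 785 data set
# print(sorted(counts.items()))  # roughly 1,000 images per digit
```

If the row or column count differs from what the article describes, the download was probably truncated or the wrong sample file was picked.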
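
For readers who want to reproduce a similar chart outside Data Refinery, here is a rough sketch using scikit-learn's `TSNE`, which is a different implementation from the one in Watson Studio, so the layout of the clouds will not match exactly. It assumes NumPy, scikit-learn, and matplotlib are installed, that the CSV has a header row, and that the "label" column comes first followed by the 784 pixel columns; adjust the slicing if your file is laid out differently. The perplexity of 75 comes from the steps above, and the iteration count is left at scikit-learn's default. The function names are made up for this example.

```python
import numpy as np
from sklearn.manifold import TSNE


def load_digits_csv(path):
    """Load pixels and labels. Assumes a header row and the label in column 0."""
    table = np.loadtxt(path, delimiter=",", skiprows=1)
    labels = table[:, 0].astype(int)
    pixels = table[:, 1:]
    return pixels, labels


def embed_tsne(pixels, perplexity=75.0, seed=0):
    """Project each row of `pixels` to 2-D with scikit-learn's t-SNE."""
    model = TSNE(n_components=2, perplexity=perplexity, random_state=seed)
    return model.fit_transform(pixels)


def plot_embedding(points, labels, out_path="tsne_digits.png"):
    """Save a scatter plot of the 2-D points, colored by digit label."""
    import matplotlib.pyplot as plt  # imported here so the rest works without it

    fig, ax = plt.subplots(figsize=(8, 8))
    scatter = ax.scatter(points[:, 0], points[:, 1], c=labels, cmap="tab10", s=4)
    ax.legend(*scatter.legend_elements(), title="label")
    fig.savefig(out_path, dpi=150)
    plt.close(fig)


# Example usage (file name taken from step 1):
# pixels, labels = load_digits_csv("mnist_all_sample_10000.csv")
# points = embed_tsne(pixels)   # can take several minutes on 10,000 rows
# plot_embedding(points, labels)
```

Note that the perplexity has to be smaller than the number of rows, so the value of 75 only makes sense on a reasonably large sample.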