
Spark MOOC setup and labs for DBC users


This repository contains the lab assignments, in .dbc format for Databricks users, for two edX courses: Introduction to Big Data with Apache Spark and Scalable Machine Learning. Before downloading, please read the instructions below on how to get the raw form of the .dbc files.

If you have a Databricks account on mooc0*.cloud.databricks.com, your shard is already set up. If you are running your own shard, the dbc-mooc-setub.dbc notebook here explains how to set it up for the course.

If you are using Jupyter/IPython via the standard course VM, see https://github.com/spark-mooc/mooc-setup for the .ipynb files instead.

How to download the .dbc file for a lab - don't be tricked!

Be sure to download the raw archive. If you download from the wrong place, GitHub will happily serve you a file with a .dbc extension that is simply an HTML page describing the file.

The right method is to first click on the lab you want, and then click on either "View Raw" or the Raw button to download the actual file.
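
If you are not sure whether a download is the real archive or the HTML page described above, a quick local check can help. The following is only a minimal sketch: the function name `looks_like_html` and the 512-byte probe size are illustrative choices, not part of the course materials, and the check only detects the common case where the file begins with an HTML doctype or `<html` tag.

```python
def looks_like_html(path, probe_bytes=512):
    """Return True if the start of the file looks like an HTML page.

    Hypothetical helper: it reads the first few bytes and checks for an
    HTML doctype or opening <html tag. A False result does not prove the
    file is a valid .dbc archive, only that it is not obviously HTML.
    """
    with open(path, "rb") as f:
        # Ignore leading whitespace and letter case before comparing.
        head = f.read(probe_bytes).lstrip().lower()
    return head.startswith(b"<!doctype html") or head.startswith(b"<html")
```

For example, calling `looks_like_html("lab.dbc")` on a downloaded file (the file name here is a placeholder) and getting `True` suggests you saved the GitHub page rather than the raw file, and should download again using the Raw button.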