
Spark notebooks for working with Cloudant and dashDB data


notebooks

Refer to the LICENSE for information about the license under which this code is made available.

This repository includes Spark notebooks for working with Cloudant data.

Import to Cloudant: This notebook is intended for Python 2 with Spark 2.0. It uses a SparkSession from pyspark to load a CSV file stored in Bluemix object storage into a DataFrame, filters that data, and then uses the spark-cloudant connector to write the filtered data to a previously created Cloudant database. The example notebook loads a CSV file listing child care providers in Massachusetts, downloaded from https://data.mass.gov/Education/Program-list-for-Child-Care-Search-1-15-2015/cb6m-ccic
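
The load, filter, and write steps described above can be sketched roughly as follows. This is a minimal illustration and not the notebook's actual code: the helper names `cloudant_options` and `import_csv_to_cloudant`, and the `column` and `value` filter parameters, are hypothetical, and the sketch assumes the spark-cloudant connector is available on the Spark classpath and that `csv_path` is a location the Spark session is already configured to read.

```python
def cloudant_options(host, username, password):
    """Build the connection options read by the spark-cloudant connector."""
    return {
        "cloudant.host": host,
        "cloudant.username": username,
        "cloudant.password": password,
    }


def import_csv_to_cloudant(csv_path, database, options, column, value):
    """Load a CSV file, keep the rows where column == value, and write them to Cloudant.

    `column` and `value` are hypothetical filter parameters; the notebook
    applies its own filter to the child care provider data.
    """
    # Imported here so cloudant_options() can be used without Spark installed.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("import-to-cloudant").getOrCreate()

    # Load the CSV into a DataFrame, treating the first row as column names.
    df = spark.read.csv(csv_path, header=True, inferSchema=True)

    # Keep only the rows that match the filter.
    filtered = df.filter(df[column] == value)

    # Write the filtered rows to a Cloudant database that already exists.
    filtered.write.format("com.cloudant.spark").options(**options).save(database)
    return filtered


# Example call (all values below are placeholders, not real credentials):
# opts = cloudant_options("ACCOUNT.cloudant.com", "USERNAME", "PASSWORD")
# import_csv_to_cloudant("PATH_TO_CSV", "DATABASE_NAME", opts, "COLUMN", "VALUE")
```

Keeping the credentials in a single options dictionary means they are defined once and can be reused if the same session later reads the data back from Cloudant.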