A Text Analysis Using Project Gutenberg

=======================================

Installation (OSX)

  1. Install the Homebrew package manager:

/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

  2. Install Java from: https://java.com/en/download/

  3. Use Homebrew to install Python 3, Spark, and Scala:

brew install python3 apache-spark scala

  4. Set the environment variables for Spark and Java in your ~/.bash_profile. Java can be installed in many places, so adjust this to your system. For example:

if which java > /dev/null; then export JAVA_HOME=$(/usr/libexec/java_home); fi

# set up Spark to launch Jupyter for prototyping
export PYSPARK_DRIVER_PYTHON=jupyter
export PYSPARK_DRIVER_PYTHON_OPTS='notebook'

  5. Set up a virtual environment and activate it (for example, python3 -m venv .venv followed by source .venv/bin/activate).

  6. Install the requirements: pip install -r requirements.txt

  7. Prototype anything locally and, when ready, run it on a Spark cluster!
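
Once the steps above are done, a quick sanity check can confirm that the tools are actually on your PATH before you open a notebook. The snippet below is only a sketch: `check_tool` is a hypothetical helper, not part of this repo, and the tool list is an assumption based on the installation steps (jupyter is included because PYSPARK_DRIVER_PYTHON points at it).

```shell
# Hypothetical helper (not part of this repo): report whether a command
# is available on the PATH. Returns non-zero when the command is missing.
check_tool() {
    if command -v "$1" > /dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

# Assumed tool list, taken from the installation steps above.
for tool in java python3 scala spark-submit pyspark jupyter; do
    check_tool "$tool" || true   # keep going so every tool gets reported
done
```

Anything reported as missing usually means the matching step was skipped, or that a new shell has not yet picked up the changes to your bash_profile.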