Apache Spark (PySpark) Scripts and References
This repo contains Spark code written in Python using the PySpark API. Feel free to copy it and use it as-is. Let me know if you have any questions or feedback about the code.
Zeppelin Notebook Hub (can be used to view Zeppelin notebooks saved in JSON format): https://www.zeppelinhub.com/viewer/
References:
Apache Spark Quickstart
Spark Python (PySpark) API
Databricks - Guide
Databricks - Developer Resources
Spark Tuning Guide
Spark Tuning - Garbage Collection
Hortonworks - Spark Reference