General documentation can be found here. Example usage patterns can be found here.
Our intent with livy-submit is that it behave similarly to spark-submit in "cluster" deploy mode. To enable this functionality, WebHDFS or HttpFS must be (1) enabled on the Hadoop cluster and (2) reachable from wherever you are running livy-submit. A description of WebHDFS and HttpFS from Cloudera can be found here.
- Your client must be able to reach the Livy server.
- Your client must be able to reach the NameNode so that your Python files can be uploaded to HDFS, where Spark can pull them down at runtime.
- You must have WebHDFS enabled (or HttpFS, though that has not been tested yet).
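As a quick sanity check of the two reachability requirements above, you can issue plain HTTP GETs against Livy's `/sessions` endpoint and the WebHDFS `LISTSTATUS` operation. The sketch below is a minimal, stdlib-only example; the hostnames `livy-host` and `namenode` are placeholders, and the ports assume the defaults (Livy on 8998, WebHDFS on 50070; Hadoop 3.x uses 9870).

```python
from urllib.request import urlopen


def livy_sessions_url(host, port=8998):
    """URL for Livy's GET /sessions endpoint, which lists active sessions."""
    return f"http://{host}:{port}/sessions"


def webhdfs_liststatus_url(host, path="/", port=50070):
    """WebHDFS REST URL that lists an HDFS directory (op=LISTSTATUS)."""
    return f"http://{host}:{port}/webhdfs/v1/{path.lstrip('/')}?op=LISTSTATUS"


if __name__ == "__main__":
    # Replace the placeholder hostnames with your own before running.
    for url in (livy_sessions_url("livy-host"),
                webhdfs_liststatus_url("namenode")):
        try:
            with urlopen(url, timeout=5) as resp:
                print(url, "->", resp.status)  # 200 means reachable
        except OSError as exc:
            print(url, "-> unreachable:", exc)
```

An HTTP 200 from both endpoints means your client satisfies the visibility requirements; a connection timeout usually points to a firewall or DNS issue between your client and the cluster.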
- Create the conda environment:

  ```bash
  conda create -n livy-submit-dev --file requirements-dev.txt --file requirements.txt
  ```
- Build the apidoc files (run `conda activate livy-submit-dev` first):

  ```bash
  sphinx-apidoc -f -o docs/source livy_submit
  ```
- Build the docs:

  ```bash
  cd docs
  make html
  ```
- View the docs by opening `docs/build/index.html` in a web browser.