General documentation can be found here. Example usage patterns can be found here.
Our intent with livy-submit is that it behave similarly to spark-submit in "cluster" deploy mode. To enable this functionality, WebHDFS or HttpFS must be (1) enabled on the Hadoop cluster and (2) reachable from wherever you are running livy-submit. A description of WebHDFS and HttpFS from Cloudera can be found here.
- Your client must be able to reach the Livy server.
- Your client must be able to reach the NameNode so that your Python files can be uploaded to HDFS, where Spark can pull them down at runtime.
- You must have WebHDFS enabled (or HttpFS, though that has not been tested yet).
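As a quick sanity check of the two reachability requirements above, you can issue plain HTTP GETs against Livy's `/sessions` endpoint and the WebHDFS `LISTSTATUS` operation. The sketch below is a minimal, stdlib-only example; the hostnames `livy-host` and `namenode` are placeholders, and the ports assume the defaults (Livy on 8998, WebHDFS on 50070; Hadoop 3.x uses 9870).

```python
from urllib.request import urlopen


def livy_sessions_url(host, port=8998):
    """URL for Livy's GET /sessions endpoint, which lists active sessions."""
    return f"http://{host}:{port}/sessions"


def webhdfs_liststatus_url(host, path="/", port=50070):
    """WebHDFS REST URL that lists an HDFS directory (op=LISTSTATUS)."""
    return f"http://{host}:{port}/webhdfs/v1/{path.lstrip('/')}?op=LISTSTATUS"


if __name__ == "__main__":
    # Replace the placeholder hostnames with your own before running.
    for url in (livy_sessions_url("livy-host"),
                webhdfs_liststatus_url("namenode")):
        try:
            with urlopen(url, timeout=5) as resp:
                print(url, "->", resp.status)  # 200 means reachable
        except OSError as exc:
            print(url, "-> unreachable:", exc)
```

An HTTP 200 from both endpoints means your client satisfies the visibility requirements; a connection timeout usually points to a firewall or DNS issue between your client and the cluster.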
- Create the conda environment:

  ```bash
  conda create -n livy-submit-dev --file requirements-dev.txt --file requirements.txt
  ```
- Build the apidoc files (run `conda activate livy-submit-dev` first):

  ```bash
  sphinx-apidoc -f -o docs/source livy_submit
  ```
- Build the docs:

  ```bash
  cd docs
  make html
  ```
- View the docs by opening `docs/build/index.html` in a web browser.