A simple script to build a jar with all dependencies needed for your Spark shell session.
It aims to encourage Spark and Scala usage among those who may not have used either before.
One of the challenges of using the spark-shell for data exploration is adding libraries once you get beyond basic functionality, especially if you are newer to Scala or Java and have not used tools like Maven or Gradle to resolve dependencies before. This script aims to eliminate that overhead and create a single jar with all the dependencies you need.
Once some polish is added to this implementation, it should ease exploratory work in the spark-shell.
Note: This project uses Gradle. You must install Gradle (2.2). If you would rather not install Gradle locally, you can use the Gradle Wrapper by replacing all references to gradle with gradlew.
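As a rough sketch of the choice between a local Gradle install and the Gradle Wrapper, a small shell helper could pick whichever is available. The function name gradle_cmd is hypothetical and not part of this project, and the sketch assumes the wrapper script is named gradlew and is executable.

```shell
# Hypothetical helper (not part of this project): print the Gradle command to use.
# Prefers the Gradle Wrapper in the given directory (default: current directory)
# and falls back to a locally installed gradle otherwise.
gradle_cmd() {
  dir="${1:-.}"
  if [ -x "$dir/gradlew" ]; then
    printf '%s\n' "$dir/gradlew"
  else
    printf '%s\n' "gradle"
  fi
}

# Example usage (runs the build with whichever command was found):
#   "$(gradle_cmd)" build
```

The same helper works for the other tasks in this README, for example "$(gradle_cmd)" deployLocal.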
- Add any dependencies you like to the config.gradle file.
- Execute gradle build to build a jar with all dependencies. The generated jar will be in './build/libs/'.
Note: At this point you can copy the jar manually wherever you need. However, deploy functionality was added for convenience. See below.
When executing spark-shell, add the --jars argument with the path to the generated jar.
Ex: spark-shell --jars ~/spark-shell-deps/spark-shell-deps-*.jar
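If several versions of the jar end up in the same directory, the wildcard in the example above can match more than one file. A minimal sketch of one way around that is to pick the most recently modified jar and pass only that one to --jars. The function name latest_jar is hypothetical, and the sketch assumes jar file names contain no newlines.

```shell
# Hypothetical helper (not part of this project): print the most recently
# modified .jar file in the given directory, or nothing if there is none.
latest_jar() {
  dir="$1"
  # ls -t sorts by modification time, newest first; errors from an
  # unmatched glob are discarded so the output is simply empty.
  ls -t "$dir"/*.jar 2>/dev/null | head -n 1
}

# Example usage (the directory is the one from the example above):
#   jar="$(latest_jar "$HOME/spark-shell-deps")"
#   [ -n "$jar" ] && spark-shell --jars "$jar"
```

Choosing by modification time rather than by version number is a simplification; it relies on the newest build also being the last file copied into the directory.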
- Set the localDeployPath in the config.gradle file.
- Execute gradle deployLocal to copy the jar to the specified location.
- Set the remoteDeployHost, remoteDeployUser, and remoteDeployPath in the config.gradle file.
- Execute gradle deployRemote to copy the jar to the specified host and location.
- When prompted, enter the ssh password for the specified user account.
- Wait for the jar to upload.
Execute gradle dependencyUpdates to list dependencies with new versions available.
You can change any functionality by editing the build.gradle script. However, this requires some knowledge of Gradle and the plugins used. See links below.
- Add a shell script (using gradlew) to eliminate the need for Gradle calls/knowledge.
- Test on Windows
- Find better way to reference artifact in Gradle
- Review README