Grafink is a Spark ETL job that loads data into JanusGraph.
sbt compile
To compile against Scala 2.11:
sbt ++2.11.11 compile
This project uses scalafmt to format code. To format the code:
sbt scalafmt // format sources
sbt test:scalafmt // format test sources
sbt sbt:scalafmt // format .sbt sources
sbt test
sbt assembly
sbt dist
The above creates a deployable zip file, grafink-<version>.zip. The contents of the zip file are:
- conf/application.conf // Modify this config file according to the job requirements.
- grafink assembly jar // The main executable jar for running the Spark job.
- bin/start.sh // The main executable script that the user invokes to start the job.
To compile and package against Scala 2.11:
sbt ++2.11.11 dist
./bin/start.sh --config conf/application.conf --startdate <yyyy-MM-dd> --duration 1 --num-executors 2 --driver-memory 2g --executor-memory 2g
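Because --startdate expects a yyyy-MM-dd value, it can help to check the date before submitting the job. The following is only a minimal sketch: build_grafink_cmd is a hypothetical helper name, not part of Grafink, and the flag values are copied from the example invocation above. It checks only the shape of the date (not whether it is a real calendar date) and prints the command rather than running it.

```shell
# Hypothetical helper (not shipped with Grafink): checks that the start date
# looks like yyyy-MM-dd, then prints the start.sh command line from the example.
build_grafink_cmd() {
  start_date=$1
  case "$start_date" in
    # Shape check only: four digits, dash, two digits, dash, two digits.
    [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]) ;;
    *)
      echo "expected start date as yyyy-MM-dd, got: $start_date" >&2
      return 1
      ;;
  esac
  # Flag values below are the ones from the example invocation; adjust as needed.
  printf '%s\n' "./bin/start.sh --config conf/application.conf --startdate $start_date --duration 1 --num-executors 2 --driver-memory 2g --executor-memory 2g"
}
```

Calling build_grafink_cmd with a date such as 2020-01-15 prints the full command, which can then be run from the directory where the zip file was extracted.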