Spark-TPC-DS
Spark job for the TPC-DS benchmark.
This code uses the spark-sql-perf library from Databricks: https://github.com/databricks/spark-sql-perf
To compile, put the jar built from the above library in lib/ and then run build/sbt assembly
To execute the job, the following arguments must be provided:
- HDFS data location ("/user/test/tpcds-data")
- scale factor (10)
- HDFS result location ("/user/test/tpcds-results")
- number of iterations
- query set to execute, one of:
  - impalakit
  - interactive
  - reporting
  - deepAnalytics
  - simple
- dsdgenDir (directory containing the dsdgen tool from the TPC-DS kit)
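
The argument list above can be illustrated with a small parsing sketch. This is not the job's actual code: the names `BenchmarkArgs` and `parse` are hypothetical, and the sketch assumes that the arguments are positional, that they are passed in the order listed, and that Scala 2.12 or later is used (for right-biased `Either`).

```scala
// Hypothetical sketch, not the job's real entry point.
// Assumption: six positional arguments, in the order the README lists them.
final case class BenchmarkArgs(
    dataLocation: String,   // HDFS data location
    scaleFactor: Int,       // TPC-DS scale factor
    resultLocation: String, // HDFS result location
    iterations: Int,        // number of iterations
    querySet: String,       // one of the names in QuerySets
    dsdgenDir: String)      // dsdgenDir argument

object BenchmarkArgs {
  // Query-set names exactly as they appear in the README.
  val QuerySets: Set[String] =
    Set("impalakit", "interactive", "reporting", "deepAnalytics", "simple")

  // Returns Right(parsed arguments) or Left(error message).
  def parse(args: Array[String]): Either[String, BenchmarkArgs] =
    args match {
      case Array(data, scale, results, iters, querySet, dsdgen) =>
        for {
          sf <- toPositiveInt(scale, "scale factor")
          n  <- toPositiveInt(iters, "iterations")
          qs <- Either.cond(QuerySets.contains(querySet), querySet,
                            s"unknown query set: $querySet")
        } yield BenchmarkArgs(data, sf, results, n, qs, dsdgen)
      case _ =>
        Left(s"expected 6 arguments, got ${args.length}")
    }

  // Parses a strictly positive integer, reporting the argument name on failure.
  private def toPositiveInt(s: String, name: String): Either[String, Int] =
    scala.util.Try(s.toInt).toOption
      .filter(_ > 0)
      .toRight(s"$name must be a positive integer: $s")
}
```

For example, `BenchmarkArgs.parse(Array("/user/test/tpcds-data", "10", "/user/test/tpcds-results", "3", "simple", "/opt/tpcds-kit/tools"))` yields a `Right` holding the parsed values. The dsdgen path used here is a placeholder, not a location this project prescribes.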