
Use SparkSQL to measure datastore performance.


SparkSQL DataStore Benchmark on Mesos

The Workload

The workload reads data from SoftLayer Object Storage using Spark's Swift integration, then writes that data to a DataStore using Spark SQL. The run results, throughput and latency, are also stored as a CSV file in SoftLayer Object Storage.
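The README does not show the measurement code itself; as an illustration only, the two reported metrics and the CSV output could be produced along these lines (function and column names here are hypothetical, not taken from the project):

```python
import csv
import io
import time

def benchmark(write_fn, num_records):
    """Time a datastore write and derive the two reported metrics."""
    start = time.time()
    write_fn()                                     # e.g. a Spark SQL save to the DataStore
    elapsed = time.time() - start
    return {
        "latency_s": elapsed,                      # wall-clock time of the write
        "throughput_rps": num_records / elapsed,   # records written per second
    }

def results_csv(rows):
    """Render benchmark results as CSV text, ready to upload to Object Storage."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["latency_s", "throughput_rps"])
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```

In the real workload the timed function would be a Spark SQL write and the CSV would be uploaded back through the Swift integration rather than kept in memory.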

A sample run against Elasticsearch

export marathonIp=MARATHON_IP
curl -i -H 'Content-Type: application/json' -d@config/es/marathon-es.json $marathonIp:8080/v2/apps
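The contents of config/es/marathon-es.json are not shown here; a minimal Marathon app definition of the general shape POSTed above might look like the following sketch (the app id, image name, and resource sizes are assumptions, not values from the project):

```python
import json

# Hypothetical Marathon app definition; only the field names ("id", "cpus",
# "mem", "instances", "container") follow Marathon's app schema.
app = {
    "id": "sparksql-es-benchmark",
    "cpus": 2.0,
    "mem": 4096,
    "instances": 1,
    "container": {
        "type": "DOCKER",
        "docker": {"image": "example/spark-mesos:latest", "network": "HOST"},
    },
}

# The JSON payload that curl would POST to $marathonIp:8080/v2/apps.
payload = json.dumps(app)
```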

Other run examples

The Spark Mesos Docker Image

The image is used both for Spark job submission and as the Spark executor image on Mesos; it can also be used to start a Spark Standalone cluster.
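As a command sketch of those two roles (the image name, master URLs, and Spark install path are assumptions, not taken from this repository; a Spark 2.x layout is assumed):

```shell
# Submit a job to Mesos, telling executors to run in the same Docker image:
spark-submit \
  --master mesos://zk://ZK_HOST:2181/mesos \
  --conf spark.mesos.executor.docker.image=example/spark-mesos:latest \
  workload.py

# Or start a Spark Standalone master and worker from the same image:
docker run -d --net=host -e SPARK_NO_DAEMONIZE=1 \
  example/spark-mesos:latest /opt/spark/sbin/start-master.sh
docker run -d --net=host -e SPARK_NO_DAEMONIZE=1 \
  example/spark-mesos:latest /opt/spark/sbin/start-slave.sh spark://MASTER_HOST:7077
```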

Reference