
Use SparkSQL to measure datastore performance.


SparkSQL DataStore Benchmark on Mesos

The Workload

The workload reads data from SoftLayer Object Storage using Spark's Swift integration, then writes that data to a DataStore using Spark SQL. The run results, throughput and latency, are also stored as a CSV file in SoftLayer Object Storage.
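The README does not show the measurement code itself; as an illustration only, the two reported metrics and the CSV output could be produced along these lines (function and column names here are hypothetical, not taken from the project):

```python
import csv
import io
import time

def benchmark(write_fn, num_records):
    """Time a datastore write and derive the two reported metrics."""
    start = time.time()
    write_fn()                                     # e.g. a Spark SQL save to the DataStore
    elapsed = time.time() - start
    return {
        "latency_s": elapsed,                      # wall-clock time of the write
        "throughput_rps": num_records / elapsed,   # records written per second
    }

def results_csv(rows):
    """Render benchmark results as CSV text, ready to upload to Object Storage."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["latency_s", "throughput_rps"])
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```

In the real workload the timed function would be a Spark SQL write and the CSV would be uploaded back through the Swift integration rather than kept in memory.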

A sample run against Elasticsearch

export marathonIp=MARATHON_IP
curl -i -H 'Content-Type: application/json' -d@config/es/marathon-es.json $marathonIp:8080/v2/apps
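The contents of config/es/marathon-es.json are not shown here; a minimal Marathon app definition of the general shape POSTed above might look like the following sketch (the app id, image name, and resource sizes are assumptions, not values from the project):

```python
import json

# Hypothetical Marathon app definition; only the field names ("id", "cpus",
# "mem", "instances", "container") follow Marathon's app schema.
app = {
    "id": "sparksql-es-benchmark",
    "cpus": 2.0,
    "mem": 4096,
    "instances": 1,
    "container": {
        "type": "DOCKER",
        "docker": {"image": "example/spark-mesos:latest", "network": "HOST"},
    },
}

# The JSON payload that curl would POST to $marathonIp:8080/v2/apps.
payload = json.dumps(app)
```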

Other run examples

The Spark Mesos Docker Image

The image is used both for Spark job submission and as the Spark executor image on Mesos; it can also be used to start a Spark Standalone cluster.
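As a command sketch of those two roles (the image name, master URLs, and Spark install path are assumptions, not taken from this repository; a Spark 2.x layout is assumed):

```shell
# Submit a job to Mesos, telling executors to run in the same Docker image:
spark-submit \
  --master mesos://zk://ZK_HOST:2181/mesos \
  --conf spark.mesos.executor.docker.image=example/spark-mesos:latest \
  workload.py

# Or start a Spark Standalone master and worker from the same image:
docker run -d --net=host -e SPARK_NO_DAEMONIZE=1 \
  example/spark-mesos:latest /opt/spark/sbin/start-master.sh
docker run -d --net=host -e SPARK_NO_DAEMONIZE=1 \
  example/spark-mesos:latest /opt/spark/sbin/start-slave.sh spark://MASTER_HOST:7077
```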

Reference