
This repository contains the example code for "Apache Mesos (10): Creating Complex Jobs with Chronos". It makes minor changes to the wordcount-example code from Mesos in Action.

This is an example Spark job that reads a copy of Leo Tolstoy's War and Peace from HDFS, counts the number of times each word appears, then stores the word counts in a text file (also on HDFS). This is meant to be used with the Chronos jobs located at ../complex-etl-job.
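
To make the counting step concrete, here is a minimal local sketch in plain Python. It is not the Spark job from this repo and does not touch HDFS. The helper names `count_words` and `format_counts`, the tokenization rule, the lowercase normalization, and the `word,count` output format are all assumptions made for illustration; the real job may split and format words differently.

```python
import re
from collections import Counter


def count_words(text):
    """Count how often each word appears (hypothetical helper, not from this repo)."""
    # Assumption: a word is a run of letters or apostrophes, compared case-insensitively.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)


def format_counts(counts):
    """Render counts as text lines, sorted alphabetically by word."""
    # Assumption: one "word,count" line per distinct word.
    return [f"{word},{count}" for word, count in sorted(counts.items())]


if __name__ == "__main__":
    sample = "war and peace and war"
    for line in format_counts(count_words(sample)):
        print(line)
```

A Spark job would distribute the same idea across the cluster, typically by splitting lines into words, mapping each word to a count of 1, and summing the counts per word.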


Clone the repo and package up the example:

$ git clone
$ cd spark-wordcount/wordcount-example
$ sbt package

Assuming the spark-submit utility is available on the $PATH of your gateway machine, submit the job by running the following command:

$ spark-submit target/scala-2.10/war-and-peace-wordcount_2.10-0.1.0-SNAPSHOT.jar \


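The results of the job can then be found on HDFS at ${basepath}/warandpeace-counts.txt, which Spark writes as a directory of part-* files. To list the 10 most frequent words in the book, run the following command (the example uses /tmp/warandpeace as the base path):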

$ hadoop fs -cat /tmp/warandpeace/warandpeace-counts.txt/part-* | sort -t, -rnk2 | head -10
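
The pipeline above sorts on the second comma-separated field in reverse numeric order and keeps the first ten lines. Roughly the same selection can be sketched in Python, for example to post-process the part files after copying them locally. The `top_words` helper name and the line format (`word,count`, optionally wrapped in parentheses) are assumptions; this README does not confirm the exact output format of the job.

```python
import heapq


def top_words(lines, n=10):
    """Return the n (word, count) pairs with the highest counts (hypothetical helper)."""
    pairs = []
    for line in lines:
        # Assumption: each line looks like "word,count" or "(word,count)".
        line = line.strip().strip("()")
        if not line:
            continue
        # Split on the last comma so the count is always the final field.
        word, _, count = line.rpartition(",")
        pairs.append((word, int(count)))
    # nlargest returns the pairs ordered from highest to lowest count.
    return heapq.nlargest(n, pairs, key=lambda pair: pair[1])


if __name__ == "__main__":
    sample = ["(the,5)", "(and,3)", "(peace,1)", "(war,2)"]
    for word, count in top_words(sample, n=3):
        print(f"{word},{count}")
```

Using `heapq.nlargest` avoids sorting the full list when only the top few entries are needed.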
