This is a wrapper cookbook over the hadoop cookbook.
This setup creates 3 Vagrant boxes: 1 master and 2 slaves.
To set up the cluster:
- berks vendor cookbooks
- vagrant up --provision

To destroy the cluster:
- vagrant destroy -y
Append the following to the /etc/hosts file on the host machine:
192.168.33.43 local-spark-cluster-master.org.local local-spark-cluster-master
192.168.33.44 local-spark-cluster-slave01.org.local local-spark-cluster-slave01
192.168.33.45 local-spark-cluster-slave02.org.local local-spark-cluster-slave02
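The three entries above can be appended in one step; a minimal sketch (run on the host machine, requires sudo):

```shell
# Append the cluster host entries in one go (run on the host machine)
sudo tee -a /etc/hosts >/dev/null <<'EOF'
192.168.33.43 local-spark-cluster-master.org.local local-spark-cluster-master
192.168.33.44 local-spark-cluster-slave01.org.local local-spark-cluster-slave01
192.168.33.45 local-spark-cluster-slave02.org.local local-spark-cluster-slave02
EOF
```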
Machine | Service | URI |
---|---|---|
Master | YARN ResourceManager | http://local-spark-cluster-master.org.local:8088/ |
Slave01 | YARN NodeManager | http://local-spark-cluster-slave01.org.local:8042/ |
Slave02 | YARN NodeManager | http://local-spark-cluster-slave02.org.local:8042/ |
Master | Hadoop HDFS NameNode | http://local-spark-cluster-master.org.local:50070/ |
Slave01 | Hadoop HDFS DataNode | http://local-spark-cluster-slave01.org.local:50075/ |
Slave02 | Hadoop HDFS DataNode | http://local-spark-cluster-slave02.org.local:50075/ |
Master | Spark HistoryServer | http://local-spark-cluster-master.org.local:18080/ |
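Once the boxes are up, the UIs above can be spot-checked from the host. A minimal sketch; the `check_ui` helper is just an illustration, and it assumes the /etc/hosts entries above are already in place:

```shell
# Report whether each web UI answers; prints "up:" or "down:" per URL
check_ui() {
  curl -fsS --max-time 5 -o /dev/null "$1" && echo "up: $1" || echo "down: $1"
}
check_ui http://local-spark-cluster-master.org.local:8088/    # YARN ResourceManager
check_ui http://local-spark-cluster-master.org.local:50070/   # HDFS NameNode
check_ui http://local-spark-cluster-master.org.local:18080/   # Spark HistoryServer
```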
As part of this setup, the following services are configured:

Master
- HDFS NameNode
- YARN ResourceManager
- Spark HistoryServer

Slave
- HDFS DataNode
- YARN NodeManager
Log in to the master machine

vagrant ssh master

Log in as the hdfs user

sudo su - hdfs
Submit the example SparkPi job with spark-submit
spark-submit --class org.apache.spark.examples.SparkPi --master yarn --deploy-mode cluster --executor-memory 1G /usr/hdp/2.6.3.0-235/spark2/examples/jars/spark-examples_2.11-2.2.0.2.6.3.0-235.jar