


Check our presentation


##1-AWS cluster initialization

Follow the documentation page "Using OpsCenter to create a cluster on Amazon EC2".

Create the key pair, then restrict its permissions so that ssh accepts it:

chmod 400 <my-key-pair>.pem

##2-Launch the AMI (us-east-1 ami-f9a2b690)

Follow the documentation page "Installing on Amazon EC2", section "Launch the AMI".

In Advanced Details, set the following parameters:

--version enterprise
--analyticsnodes 6
--totalnodes 6
--username datastaxusername
--password datastaxpassword
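
The parameters above are passed to the AMI as a single line of user data. A minimal Python sketch of how that line could be assembled and sanity-checked before launching is shown here. The helper name `build_user_data` is hypothetical, and writing the analytics option as the single token `analyticsnodes` is an assumption about the spelling the AMI expects.

```python
def build_user_data(options):
    """Join option/value pairs into one user-data line (hypothetical helper)."""
    # The number of analytics nodes cannot exceed the cluster size.
    if options["analyticsnodes"] > options["totalnodes"]:
        raise ValueError("analyticsnodes cannot exceed totalnodes")
    return " ".join(f"--{name} {value}" for name, value in options.items())


# Values copied from the parameter list above.
ami_options = {
    "version": "enterprise",
    "analyticsnodes": 6,  # assumed spelling of the analytics option
    "totalnodes": 6,
    "username": "datastaxusername",
    "password": "datastaxpassword",
}

user_data = build_user_data(ami_options)
print(user_data)
```

The resulting string can be pasted into the user data field of the launch wizard.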

##3-Connect to the Spark Master

ssh -i <my-key-pair>.pem ubuntu@<ip-master>

Then launch the Spark shell:

dse spark

##4-Connect to the Cassandra Master

ssh -i <my-key-pair>.pem ubuntu@<ip-master>

Then launch the CQL shell:

cqlsh

##5-On Spark, import data from S3, preprocess the data and save to Cassandra:

In the Spark shell opened in step 3, paste the contents of the file sparkCSV.scala.

##6-Install Python libraries on the AMIs

Clone this repository with git on each instance, then execute:

sh config_python.sh

##7-Create keyspaces and tables on Cassandra

CREATE KEYSPACE test WITH replication = { 'class': 'SimpleStrategy', 'replication_factor': 2 };
-- Assumption: both tables belong to the test keyspace created above
USE test;
CREATE TABLE cassandraresult (seismetime text, tel text, lat text, longi text, warnedtime text, PRIMARY KEY (seismetime, tel));
CREATE TABLE test_spark_bigText (t timestamp, id_ville text, tels text, PRIMARY KEY ((t, id_ville)));

##8-Set parameters in Requetage.py

Insert the IP addresses of the 5 worker nodes in the list IPaddressesTables (line 124):

IPaddressesTables=['','','','', '']
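
Since an empty or mistyped entry in this list only shows up once the queries run, a quick check before launching can help. The sketch below is not part of the repository: `check_worker_ips` is a hypothetical helper, and the addresses are placeholders from the documentation range rather than real nodes.

```python
import ipaddress

EXPECTED_WORKERS = 5  # this step asks for the 5 worker nodes


def check_worker_ips(addresses, expected=EXPECTED_WORKERS):
    """Return a list of problems; an empty list means the input looks usable."""
    problems = []
    if len(addresses) != expected:
        problems.append(f"expected {expected} addresses, got {len(addresses)}")
    for index, address in enumerate(addresses):
        try:
            ipaddress.ip_address(address)  # raises ValueError on bad input
        except ValueError:
            problems.append(f"entry {index} is not a valid IP address: {address!r}")
    if len(set(addresses)) != len(addresses):
        problems.append("some addresses are duplicated")
    return problems


# Placeholder values (assumption); replace with the worker IPs of your cluster.
IPaddressesTables = ['192.0.2.11', '192.0.2.12', '192.0.2.13', '192.0.2.14', '192.0.2.15']
print(check_worker_ips(IPaddressesTables))
```

An empty list printed here means all five entries parsed as IP addresses and none is repeated.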

##9-Launch the Python file requetage.py

Enter the latitude, longitude and time values when asked:

latitude : 35.01
longitude : 135.0
datetime YYYY-MM-DD HH:MM: 2015-01-25 10:50
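
The three values above have to be typed in a fixed format. As an illustration of what a valid input looks like, here is a minimal sketch of how such values could be parsed and range-checked. `parse_query` is a hypothetical function and is not taken from requetage.py, whose actual parsing may differ.

```python
from datetime import datetime

# Format matching the prompt "YYYY-MM-DD HH:MM".
DATETIME_FORMAT = "%Y-%m-%d %H:%M"


def parse_query(latitude, longitude, when):
    """Convert the three text inputs into (float, float, datetime)."""
    lat = float(latitude)
    lon = float(longitude)
    if not -90.0 <= lat <= 90.0:
        raise ValueError(f"latitude out of range: {lat}")
    if not -180.0 <= lon <= 180.0:
        raise ValueError(f"longitude out of range: {lon}")
    moment = datetime.strptime(when, DATETIME_FORMAT)
    return lat, lon, moment


# The sample values from this step.
print(parse_query("35.01", "135.0", "2015-01-25 10:50"))
```

A date typed in another format, such as 25/01/2015, raises a ValueError in this sketch instead of being silently misread.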