Building & Running

git clone https://github.com/tbinetruy/Telecom-INF729.git
cd Telecom-INF729

# Extract dataset
tar -xzf prepared_trainingset.tar.gz

# Download and extract Spark 2.3.2
wget https://www-eu.apache.org/dist/spark/spark-2.3.2/spark-2.3.2-bin-hadoop2.7.tgz
tar -xf spark-2.3.2-bin-hadoop2.7.tgz

# Build
sbt package

# Run
./spark-2.3.2-bin-hadoop2.7/bin/spark-submit target/scala-2.11/spark-tp3_2.11-1.0.jar

Output

+------------+-----------+-----+
|final_status|predictions|count|
+------------+-----------+-----+
|           1|        0.0|  963|
|           0|        0.0| 4387|
|           1|        1.0| 2503|
|           0|        1.0| 2879|
+------------+-----------+-----+

f1 score for model: 0.653581958726916