OperaDevelop07/Big-Data-Querying-with-Apache-Spark

This project runs queries over big data, here the MovieLens dataset, and then compares their performance on local machine clusters and on Livy server clusters.

Jupyter Notebook · MIT

Big-Data-Querying-with-Apache-Spark

In this project we experimented with Apache Spark queries on big datasets such as the MovieLens dataset (https://grouplens.org/datasets/movielens/) and tried to optimise their performance both in local cluster scenarios and in cloud/server scenarios such as the Livy server (https://livy.apache.org/).
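
As a rough illustration of the Livy side of such a comparison, the sketch below builds (but does not send) the two REST requests Livy uses to open an interactive PySpark session and submit a statement, plus a small wall-clock timer that could wrap either a local run or a Livy submission. It is not taken from the project's notebooks. The server address `http://localhost:8998`, the file name `ratings.csv`, the example query, and the helper names `build_session_request`, `build_statement_request`, and `time_call` are all assumptions made for illustration.

```python
import json
import time
import urllib.request

# Assumption: a Livy server reachable at its default port on this machine.
LIVY_URL = "http://localhost:8998"

# Hypothetical query: average rating per movie from a MovieLens ratings file.
# The path "ratings.csv" is a placeholder, not the project's actual location.
EXAMPLE_QUERY = """
ratings = spark.read.csv("ratings.csv", header=True, inferSchema=True)
ratings.groupBy("movieId").avg("rating").show(5)
"""


def build_session_request(livy_url=LIVY_URL):
    """Build, without sending, the POST request that opens a PySpark session."""
    body = json.dumps({"kind": "pyspark"}).encode("utf-8")
    return urllib.request.Request(
        f"{livy_url}/sessions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_statement_request(session_id, code, livy_url=LIVY_URL):
    """Build, without sending, the POST request that runs `code` in a session."""
    body = json.dumps({"code": code}).encode("utf-8")
    return urllib.request.Request(
        f"{livy_url}/sessions/{session_id}/statements",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def time_call(fn, *args, repeats=3):
    """Return the best wall-clock time in seconds over `repeats` calls of fn."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        best = min(best, time.perf_counter() - start)
    return best
```

With a running Livy server, each request could be sent with `urllib.request.urlopen(request)`, and `time_call` could wrap the same query on a local SparkSession and on Livy to get comparable timings. Taking the best of several runs is one simple way to reduce noise from cluster warm-up.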