SOEN691-BigData

This is Xueying Li's course project for the Big Data course.

Title: Spark versus Flink: Performance Comparisons in Big Data Analysis Frameworks

Spark vs. Flink

Spark and Flink are two Apache-hosted data analytics frameworks for analyzing big datasets. An in-depth understanding of their underlying architectural choices is important for improving data processing performance across different datasets.

Aim of Project

This project aims to compare the performance of Spark and Flink by evaluating how each framework processes streaming datasets. Several benchmarks will be considered for this evaluation.

The project will also discuss fault tolerance, a major aspect of stream processing, and each framework's support for applications such as machine learning on streaming data.