Title: Spark versus Flink: Performance Comparisons in Big Data Analysis Framework
Spark and Flink are two Apache-hosted data analytics frameworks that facilitate the analysis of large datasets. An in-depth understanding of their underlying architectural choices is important for improving data-processing performance across different datasets.
This project aims to assess the performance of Spark and Flink by evaluating their results on streaming datasets, using several different benchmarks.
Fault tolerance, a major aspect of stream processing, will be discussed, along with support for applications such as machine learning in a stream-processing context.