/apache-spark-evaluation

Evaluates the difference in execution time between RDDs (Resilient Distributed Datasets) and DataFrames in Apache Spark. The comparison also accounts for the input file format, such as CSV or Parquet.
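As a rough illustration of the kind of measurement described above, here is a minimal sketch of a timing harness. It is not the repository's actual code: the names `time_call` and `run_spark_benchmark`, the file paths, the `column` argument, and the group-and-count workload are all hypothetical choices made for this example. Each timed function ends in `collect()` because Spark evaluates lazily, so without an action nothing would actually run.

```python
import time
from typing import Callable, Dict


def time_call(fn: Callable[[], object], repeats: int = 3) -> float:
    """Return the best wall-clock time in seconds over `repeats` runs of `fn`."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


def run_spark_benchmark(csv_path: str, parquet_path: str, column: str) -> Dict[str, float]:
    """Time the same group-and-count job across APIs and file formats.

    Hypothetical workload: the paths and column name are supplied by the caller.
    """
    # Imported here so the timing helper above works without PySpark installed.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("rdd-vs-dataframe").getOrCreate()
    try:
        def dataframe_csv():
            df = spark.read.csv(csv_path, header=True, inferSchema=True)
            df.groupBy(column).count().collect()

        def dataframe_parquet():
            df = spark.read.parquet(parquet_path)
            df.groupBy(column).count().collect()

        def rdd_parquet():
            # Same aggregation expressed with the low-level RDD API.
            rdd = spark.read.parquet(parquet_path).rdd
            (rdd.map(lambda row: (row[column], 1))
                .reduceByKey(lambda a, b: a + b)
                .collect())

        return {
            "dataframe_csv": time_call(dataframe_csv),
            "dataframe_parquet": time_call(dataframe_parquet),
            "rdd_parquet": time_call(rdd_parquet),
        }
    finally:
        spark.stop()
```

Calling `run_spark_benchmark("data.csv", "data.parquet", "category")` (placeholder arguments) would return a dictionary of timings in seconds. Taking the best of several runs is one way to reduce noise from JVM warm-up and caching; averaging is an equally reasonable choice.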

Primary language: Python
License: MIT
