pyspark-dataframes
There are 3 repositories under the pyspark-dataframes topic.
sbl-sdsc/df-parallel
Comparison of DataFrame libraries for parallel processing of large tabular files on CPU and GPU.
mhaseebtariq/pyspark-helpers
Useful helper functions for PySpark DataFrame operations.
RJBarker/home_sales
Use PySpark and Spark SQL to run SQL queries against a temporary view created from a DataFrame. Run additional queries on cached and partitioned data to compare runtimes.