danzafar/tidyspark

Implement caching and persisting for `spark_tbl`

Closed this issue · 0 comments

This one could be a little tricky. A "child" DataFrame produced by an operation needs to inherit the caching of its parent, so this may require changes across the code base. Will need to review how SparkR handles this.

Will implement two functions:

  • `df %>% cache`
  • `df %>% persist(storage_level)`
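
To make the inheritance concern concrete, here is a minimal base-R sketch of one possible reading of it. It does not touch Spark or tidyspark internals: `new_mock_tbl`, `keep_rows`, and the `storage_level` field are hypothetical names invented for illustration, and the `"MEMORY_AND_DISK"` default is an assumption. It only models the R-side bookkeeping, where every operation that returns a new table would have to carry the parent's storage level forward.

```r
# Hypothetical stand-in for a spark_tbl: a list tagged with a storage level.
# No Spark is involved; this only models the bookkeeping on the R side.
new_mock_tbl <- function(data, storage_level = "NONE") {
  structure(list(data = data, storage_level = storage_level),
            class = "mock_tbl")
}

# persist() tags the table with the requested storage level (assumed default).
persist <- function(df, storage_level = "MEMORY_AND_DISK") {
  df$storage_level <- storage_level
  df
}

# cache() is shorthand for persist() with the default level.
cache <- function(df) persist(df)

# An example operation: the child copies the parent's storage level.
# Every such operation would need this step, hence "changes across the code base".
keep_rows <- function(df, rows) {
  new_mock_tbl(df$data[rows, , drop = FALSE], df$storage_level)
}

parent <- cache(new_mock_tbl(head(mtcars)))
child  <- keep_rows(parent, 1:3)
child$storage_level
```

With magrittr loaded, the same functions can be called as `df %>% cache` and `df %>% persist("MEMORY_ONLY")`, since the pipe passes `df` as the first argument.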