mrpowers-io/spark-daria

ParquetCompactor is not deleting old files and the input_file_name_parts directory on S3

We are running Spark on the Databricks platform (runtime 6.2) with PySpark and mrpowers:spark-daria:0.36.0-s_2.11. After running ParquetCompactor we get a new, large Parquet file, but the old files and the input_file_name_parts directory still exist on S3.

Is it not possible to use ParquetCompactor on S3?
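
In the meantime we are cleaning up by hand from PySpark via the Hadoop FileSystem API. This is just a minimal sketch of that workaround, not the library's own cleanup path; the s3a://my-bucket/my-table path is a placeholder, not our real bucket:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
jvm = spark._jvm
hadoop_conf = spark._jsc.hadoopConfiguration()

# Placeholder path -- substitute the real compacted table location.
leftover = jvm.org.apache.hadoop.fs.Path(
    "s3a://my-bucket/my-table/input_file_name_parts"
)
fs = leftover.getFileSystem(hadoop_conf)

# Recursively delete the intermediate directory the compactor left behind.
if fs.exists(leftover):
    fs.delete(leftover, True)  # second argument True = recursive delete
```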