ParquetCompactor is not deleting old files and the input_file_name_parts directory on S3
Closed this issue · 0 comments
kstrempel commented
ParquetCompactor is not deleting old files and the input_file_name_parts directory on S3.
We are using Spark 6.2 on the Databricks platform with PySpark and mrpowers:spark-daria:0.36.0-s_2.11. After running ParquetCompactor we get a new, big Parquet file, but the old part files and the input_file_name_parts directory still exist.
Is it not possible to use ParquetCompactor on S3?
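In the meantime, the cleanup that ParquetCompactor apparently skipped can be done manually. Below is a minimal sketch of that cleanup logic on a local filesystem, with hypothetical file names; `cleanup_after_compaction` is not part of spark-daria, and on S3 the same idea would be applied with `boto3` deletes or `aws s3 rm --recursive` instead of `pathlib`/`shutil`:

```python
import shutil
import tempfile
from pathlib import Path

def cleanup_after_compaction(table_dir: Path, compacted_name: str) -> None:
    """Delete the leftover pre-compaction part files and the
    input_file_name_parts directory, keeping only the compacted file.
    (Local-filesystem analogue of the S3 cleanup; hypothetical helper.)"""
    for part in table_dir.glob("part-*.parquet"):
        if part.name != compacted_name:
            part.unlink()
    leftovers = table_dir / "input_file_name_parts"
    if leftovers.exists():
        shutil.rmtree(leftovers)

# Demo on a throwaway directory with hypothetical file names.
tmp = Path(tempfile.mkdtemp())
(tmp / "input_file_name_parts").mkdir()
for name in ["part-00000.parquet", "part-00001.parquet", "part-compacted.parquet"]:
    (tmp / name).touch()

cleanup_after_compaction(tmp, "part-compacted.parquet")
print(sorted(p.name for p in tmp.iterdir()))  # ['part-compacted.parquet']
```

Note that S3 has no real directories, so "deleting the directory" there means deleting every object under the `input_file_name_parts/` prefix.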