EXPORT_IMPORT data duplication with subsequent runs
dstreev opened this issue · 1 comments
dstreev commented
By default, the IMPORT process doesn't DROP existing data, so each additional run appends to the current dataset.
If you're using this process to OVERWRITE an existing table, you may not get the results you'd expect: the table ends up with duplicated data rather than a fresh copy.
dstreev commented
Further research shows that this is normal behavior for the Hive EXPORT_IMPORT process. Take precautions to 'reload' the data yourself before re-running. hms-mirror is primarily a migration / one-time use tool and doesn't check existing data for these conditions.
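
One possible way to 'reload' is to drop the target table before importing again. The sketch below is not part of hms-mirror; it only builds the HiveQL statements you would run yourself for such a reload. The helper name `reload_statements`, the table name, and the export path are hypothetical, and it assumes a managed table whose data is removed by `DROP TABLE` (for an external table the files usually stay in place and may need to be cleared separately).

```python
from typing import List


def reload_statements(table: str, export_path: str, purge: bool = False) -> List[str]:
    """Build HiveQL for a drop-then-import reload (hypothetical helper).

    table       -- target table, e.g. "sales_db.orders" (example name)
    export_path -- directory previously written by EXPORT TABLE (example path)
    purge       -- add PURGE so dropped data skips the trash
    """
    # Dropping first means the IMPORT starts from an empty target,
    # so a repeated run does not append to the previous run's data.
    drop = f"DROP TABLE IF EXISTS {table}" + (" PURGE" if purge else "")
    load = f"IMPORT TABLE {table} FROM '{export_path}'"
    return [drop, load]


if __name__ == "__main__":
    # Print the statements so they can be reviewed before running them in Hive.
    for stmt in reload_statements("sales_db.orders", "/tmp/hms_export/orders"):
        print(stmt + ";")
```

Because the drop is destructive, review the generated statements before running them against a table that holds data you still need.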