mrchristine/db-migration

Error on import hive metastore when table already exists

Closed this issue · 3 comments

When doing an import and the table already exists, an error is generated. The behavior on import of users is different: it automatically creates the users even if they already exist. Could the same be done for the metastore import and other exported items?

I still see two errors when doing an import in Azure. I am using the same cluster for the export. The cluster configuration is 6.5 (includes Apache Spark 2.4.5, Scala 2.11)
Error: Cluster 0915-223815-valid745 is in unexpected state Running.

Second error says:
org.apache.spark.sql.AnalysisException: Table default.generic_data_1_csv already exists.;
{'resultType': 'error', 'summary': 'org.apache.spark.sql.AnalysisException: Table default.generic_data_1_csv already exists.;'

Could it just overwrite the existing table without generating errors?

No, I would not want this to automatically overwrite any data / hive metastore entries.
The end user should clean up the metastore entries themselves. Deleting data can take time, and it should be done explicitly by the customer.
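
For anyone who wants to know up front which tables would need that manual cleanup, here is a minimal sketch of a pre-import check. It is not part of this tool: `find_conflicts` and both of its arguments are hypothetical, and it assumes you have already collected the exported table names and the table names present in the target metastore as plain strings. It only reports collisions and never drops anything.

```python
def find_conflicts(exported_tables, existing_tables):
    """Return exported table names that already exist in the target, sorted.

    Hypothetical helper: both arguments are iterables of "db.table" strings.
    Names are compared case-insensitively, since Hive table names are
    case-insensitive.
    """
    existing = {name.lower() for name in existing_tables}
    return sorted(name for name in exported_tables if name.lower() in existing)


if __name__ == "__main__":
    exported = ["default.generic_data_1_csv", "default.new_table"]
    existing = ["default.generic_data_1_csv"]
    for name in find_conflicts(exported, existing):
        print(f"Already exists in target, clean up manually before import: {name}")
```

Reporting instead of deleting keeps the decision to remove data with the person running the import.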

I tested this and saw that errors are skipped and we continue to import. I'm closing this out; please let me know if you see different behavior.
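
The skip-and-continue behavior described here can be sketched roughly as follows. This is an illustration, not the tool's actual code: `import_tables`, `is_already_exists_error`, and the `run_ddl` callback are hypothetical names, and the result dictionaries are assumed to carry the `resultType` and `summary` keys seen in the error output above.

```python
def is_already_exists_error(result):
    """True if a result dict looks like a 'table already exists' error.

    Assumes the 'resultType' / 'summary' keys shown in the error output
    quoted in this issue.
    """
    return (result.get("resultType") == "error"
            and "already exists" in result.get("summary", ""))


def import_tables(ddl_by_table, run_ddl):
    """Run each DDL statement, skipping tables that already exist.

    ddl_by_table: dict mapping table name to its DDL string (hypothetical).
    run_ddl: callable that executes one DDL string and returns a result dict
             (hypothetical stand-in for the cluster command call).
    Returns three lists: imported, skipped, failed.
    """
    imported, skipped, failed = [], [], []
    for table, ddl in ddl_by_table.items():
        result = run_ddl(ddl)
        if result.get("resultType") != "error":
            imported.append(table)
        elif is_already_exists_error(result):
            skipped.append(table)   # existing table is left untouched
        else:
            failed.append(table)    # any other error is recorded separately
    return imported, skipped, failed
```

Keeping skipped tables separate from other failures makes it easier to see afterwards which entries were left as they were and which ones need attention.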

I will not add an option to delete / overwrite existing tables as that can delete data.