square/shift

Shift creating shadow table

Dhanasekar93 opened this issue · 10 comments

After each migration completes, shift leaves a timestamp-prefixed shadow table behind, as shown below:

+---------------------------+
| Tables_in_test            |
+---------------------------+
| 20181004125700887_sbtest1 |
| 20181004130253090_sbtest5 |
| 20181004130952087_sbtest5 |
| 20181005112052953_sbtest5 |
| 20181005112924812_sbtest5 |
| 20181005112924891_sbtest5 |
| 20181005112924892_sbtest5 |
| 20181005113204700_sbtest5 |
| 20181005113431865_sbtest5 |
| 20181005121503073_sbtest5 |
| sbtest1                   |
| sbtest2                   |
| sbtest3                   |
| sbtest4                   |
| sbtest5                   |
+---------------------------+

Is there a parameter to stop this?

Currently, no, although I think this was an oversight because we definitely should make it possible to just drop the old table if you don't want to keep it around.

This will add overhead for big tables. We could skip the rename step of the migration and drop the old table instead.

Also, is there any way to schedule a migration?

Just to provide some background on why it is the way it is - we define the "pending_drops_db" configuration option on the runner to be a special db, and we have a cron job that drops tables that are older than 7 days from that db. I agree, though, that we should provide a way to just drop the table immediately after the migration finishes.

You cannot schedule things to run at a specific time in the future, if that's what you're asking.
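As a rough illustration of the cron-based cleanup described above, here is a minimal sketch of how tables older than 7 days could be picked out by name. It assumes the prefix seen in the listing is a 17-digit `YYYYMMDDHHMMSS` timestamp plus milliseconds, which is inferred from the table names in this thread and not confirmed by the shift docs. The helper names `parse_trash_name` and `expired_tables` are hypothetical and not part of shift.

```python
import re
from datetime import datetime, timedelta

# Assumed name layout: 17 digits (YYYYMMDDHHMMSS + milliseconds), "_", original name.
PREFIX = re.compile(r"^(\d{17})_(.+)$")


def parse_trash_name(name):
    """Return (timestamp, original_table) or None if the name has no prefix."""
    match = PREFIX.match(name)
    if match is None:
        return None
    stamp, original = match.groups()
    # Parse the seconds-resolution part, then add the trailing milliseconds.
    ts = datetime.strptime(stamp[:14], "%Y%m%d%H%M%S")
    ts += timedelta(milliseconds=int(stamp[14:]))
    return ts, original


def expired_tables(names, now, max_age=timedelta(days=7)):
    """List the prefixed tables whose timestamp is older than max_age."""
    expired = []
    for name in names:
        parsed = parse_trash_name(name)
        if parsed is not None and now - parsed[0] > max_age:
            expired.append(name)
    return expired


if __name__ == "__main__":
    tables = ["20181004125700887_sbtest1", "20181005121503073_sbtest5", "sbtest1"]
    for name in expired_tables(tables, now=datetime(2018, 10, 12)):
        # A real cron job would execute this against the pending drops db.
        print(f"DROP TABLE `{name}`;")
```

Tables without the timestamp prefix are never selected, so live tables such as `sbtest1` are left alone even if the job is pointed at the wrong database.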

cadl commented

@Dhanasekar93 @michaelfinch How about adding a parameter like enable_trash? I would like to create a PR for it.

That sounds good, I'll definitely accept that PR. Please update the docs as well.

Alternatively, if pending_drops_db is empty in the config, we could just make it automatically drop the old table.

cadl commented

@michaelfinch I have created a pull request for it: #55

Still not working, even with enable_trash set to true in my config.

Config:

[root@mydbopslabs11 runner]# cat config/development-config.yaml 
# config for the database client
mysql_user: root
mysql_password: '********'
mysql_cert:
mysql_key:
mysql_rootCA:
mysql_defaults_file: config/my_development.cnf

# config for the rest client
rest_api: http://127.0.0.1:3000/api/v1/
rest_cert:
rest_key:

# general config
log_dir: /tmp/shift/
pt_osc_path: pt-online-schema-change
pending_drops_db:
enable_trash: true
log_sync_interval: 10
state_sync_interval: 10
stop_file_path: /tmp/stop_shift_runner

# optionally override the host/port/db to run an OSC on
host_override:
port_override:
database_override:

Table list in DB:

+---------------------------+
| Tables_in_sbtest          |
+---------------------------+
| 20181228070300649_sbtest1 |
| 20181228070553098_sbtest1 |
| 20181228070936012_sbtest1 |
| 20181228071104426_sbtest1 |
| 20181228071333155_sbtest1 |
| 20181228071448272_sbtest1 |
| 20181228071909914_sbtest1 |
| sbtest1                   |
| sbtest2                   |
| sbtest3                   |
| sbtest4                   |
| sbtest5                   |
+---------------------------+

If enable_trash is false, it drops the shadow table.

Isn't that the behavior we want? From the README: enable_trash: boolean. if true, the runner move the original table into the pending_drops_db with a timestamp prefix name after non-shortrun alter/drop table.

So if enable_trash is false, it should drop the shadow table. If it is true, it should move the shadow table into the pending_drops_db.
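To make the two cases concrete, here is a minimal sketch of the decision as described in the README quote. It is not the runner's actual code: `cleanup_statement` and its parameters are hypothetical names, and falling back to the source database when pending_drops_db is empty is an assumption that matches the table listing reported above, not documented behavior.

```python
def cleanup_statement(db, table, stamp, enable_trash, pending_drops_db=""):
    """Build the SQL a runner might issue for the retired table after a migration."""
    if not enable_trash:
        # enable_trash: false -> drop the table immediately.
        return f"DROP TABLE `{db}`.`{table}`"
    # enable_trash: true -> keep the table under a timestamp-prefixed name.
    # Assumption: an empty pending_drops_db means it stays in the source database.
    target_db = pending_drops_db or db
    return f"RENAME TABLE `{db}`.`{table}` TO `{target_db}`.`{stamp}_{table}`"


if __name__ == "__main__":
    print(cleanup_statement("sbtest", "sbtest1", "20181228070300649", False))
    print(cleanup_statement("sbtest", "sbtest1", "20181228070300649", True))
```

Under these assumptions, the config posted above (enable_trash: true with an empty pending_drops_db) would produce the rename, which is consistent with the timestamp-prefixed tables appearing alongside sbtest1 in the same database.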