julien-duponchelle/python-mysql-replication

Critical Bug - Using the skip_to_timestamp option causes the information schema to be queried a huge number of times

shivamgly opened this issue · 1 comments

Overview

We are using the library to periodically sync updated rows to another database. We use the skip_to_timestamp option to skip the binary log events that were already synced in the previous cycle. With this option set, we see the library querying the information schema far too many times.

Bug description

According to the logic in row_event.py (line 613), the schema appears to be fetched for every TABLE_MAP_EVENT whose table is not already present in the table_map. But in binlogstream.py (lines 551 to 557), the fetched schema is discarded if the event timestamp is less than the skip_to_timestamp option. As a result, the table is never added to the table_map, and the schema is fetched again on the next TABLE_MAP_EVENT for the same table.
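To illustrate the behavior described above, here is a minimal toy model, not the library's actual code. `FakeSchemaSource`, `fetch_columns`, and `read_events` are hypothetical names made up for this sketch; only `table_map` and `skip_to_timestamp` come from the report. It assumes the skip check runs before the cache is populated, as the report describes.

```python
# Toy model of the reported behavior. All names except table_map and
# skip_to_timestamp are hypothetical and are NOT the library's API.

class FakeSchemaSource:
    """Stands in for the information schema and counts lookups."""

    def __init__(self):
        self.queries = 0

    def fetch_columns(self, table_id):
        self.queries += 1
        return ["id", "name"]


def read_events(events, skip_to_timestamp, schema):
    table_map = {}
    delivered = []
    for timestamp, table_id in events:
        # The schema is fetched whenever the table is missing from table_map.
        if table_id in table_map:
            columns = table_map[table_id]
        else:
            columns = schema.fetch_columns(table_id)
        # The skip check runs before table_map is updated, so the
        # freshly fetched schema is thrown away for skipped events.
        if timestamp < skip_to_timestamp:
            continue
        table_map[table_id] = columns
        delivered.append((timestamp, table_id))
    return delivered


schema = FakeSchemaSource()
events = [(t, 1) for t in range(1000)]  # 1000 table map events, one table
delivered = read_events(events, skip_to_timestamp=990, schema=schema)
print(schema.queries)  # 991: one lookup per skipped event, plus one more
```

In this model, a single table costs one schema lookup per skipped event, so the number of queries grows with the amount of binlog being skipped rather than with the number of tables.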

Resolution

Populating the table_map before continuing the loop in binlogstream.py (lines 551 to 557) should fix the issue, since each table's schema would then be fetched only once even when its events are skipped.
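A rough sketch of the proposed ordering, again as a self-contained toy model rather than a patch against the library. `CountingSchemaSource`, `fetch_columns`, and `read_events_fixed` are hypothetical names; the only point is that the cache is filled before the timestamp check.

```python
# Toy model of the proposed fix. All names except table_map and
# skip_to_timestamp are hypothetical and are NOT the library's API.

class CountingSchemaSource:
    """Stands in for the information schema and counts lookups."""

    def __init__(self):
        self.queries = 0

    def fetch_columns(self, table_id):
        self.queries += 1
        return ["id", "name"]


def read_events_fixed(events, skip_to_timestamp, schema):
    table_map = {}
    delivered = []
    for timestamp, table_id in events:
        # Populate table_map first, so the schema is cached even when
        # the event itself is about to be skipped.
        if table_id not in table_map:
            table_map[table_id] = schema.fetch_columns(table_id)
        if timestamp < skip_to_timestamp:
            continue
        delivered.append((timestamp, table_id))
    return delivered


fixed_schema = CountingSchemaSource()
fixed_events = [(t, 1) for t in range(1000)]  # same workload: one table
fixed_delivered = read_events_fixed(
    fixed_events, skip_to_timestamp=990, schema=fixed_schema
)
print(fixed_schema.queries)  # 1: the schema is fetched once per table
```

With this ordering, the same events are delivered, but the lookup count depends only on the number of distinct tables seen.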

Hi @shivamgly, thanks for the report. Would you like to propose a PR?