julien-duponchelle/python-mysql-replication

Suggestion: distinguish ambiguous column value of `None`

heehehe opened this issue · 0 comments

We found there are three cases of getting None for column value.

  1. When the user set the column value as "NULL".
    if self._is_null(null_bitmap, null_bitmap_index):
    return None
  2. When the datatype is related to date(DATETIME, TIME, ...) and the value's format is not supportable.
    def __read_datetime(self):
    value = self.packet.read_uint64()
    if value == 0: # nasty mysql 0000-00-00 dates
    return None
    date = value / 1000000
    time = int(value % 1000000)
    year = int(date / 10000)
    month = int((date % 10000) / 100)
    day = int(date % 100)
    if year == 0 or month == 0 or day == 0:
    return None
  3. When the datatype is SET and the value is empty
    elif column.type == FIELD_TYPE.SET:
    # We read set columns as a bitmap telling us which options
    # are enabled
    bit_mask = self.packet.read_uint_by_size(column.size)
    return (
    set(
    val
    for idx, val in enumerate(column.set_values)
    if bit_mask & 2**idx
    )
    or None
    )

Since these values have the same value in python-mysql-replication as None, users cannot distinguish these cases.
Is it fine to distinguish these cases when dump() is executed like below?

CREATE TABLE test_table (col0 int, col1 varchar(10), col2 datetime);
INSERT INTO test_table VALUES (1, 'abc', '2023-09-09 00:00:00');
UPDATE test_table SET col1=NULL, col2='0000-00-00 00:00:00' WHERE col0=1;
# dump result for UPDATE

=== UpdateRowsEvent ===
Date: 2023-09-09T05:23:14
Log position: 1360
Event size: 37
Read bytes: 13
Table: test.test_table
Affected columns: 3
Changed rows: 1
Affected columns: 3
Values:
--
*col0:1=>1
*col1:abc=>None(null)
*col2:2023-09-09 00:00:00=>None(out of datetime range)

Here's a draft of our work : python-mysql-replication-kr#86
I'd appreciate for your feedback :)