julien-duponchelle/python-mysql-replication

Possible bad alignment of data when decoding the query event

the4thdoctor opened this issue · 9 comments

I'm having problems processing query events.
The problem is not present in version 0.26.
It seems to have been introduced in version 0.28 and is still present in 0.30.

When a query event is captured, the first character of the query is sometimes truncated. As decoding moves forward, unexpected data then appears and crashes the binlog stream process with a UTF-8 decode error.

It looks to me like the query event packet is being read from the wrong starting point and with the wrong length.
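
To illustrate the hypothesis, here is a minimal sketch of how a query event body is laid out and what a misaligned read does to it. It is not the library's code: the helper names (`build_query_event_body`, `parse_query_event_body`) and the `skew` parameter are made up for this example, and the event body is synthetic. The only thing taken from the binlog format is the 13-byte fixed part (thread id, execution time, schema length, error code, status-vars length), followed by the status vars, the schema name, a NUL byte, and the query text.

```python
import struct

# Fixed-size part of a query event body (after the common event header):
# thread id (4), execution time (4), schema length (1),
# error code (2), status-vars length (2) = 13 bytes, little-endian.
POST_HEADER = struct.Struct("<IIBHH")


def build_query_event_body(schema: bytes, query: bytes, status_vars: bytes = b"") -> bytes:
    """Build a synthetic query event body (illustration only)."""
    fixed = POST_HEADER.pack(1, 0, len(schema), 0, len(status_vars))
    return fixed + status_vars + schema + b"\x00" + query


def parse_query_event_body(body: bytes, skew: int = 0):
    """Parse the body. `skew` deliberately shifts the start of the query
    to mimic a read that begins at the wrong offset (hypothetical knob)."""
    _, _, schema_len, _, status_vars_len = POST_HEADER.unpack_from(body, 0)
    pos = POST_HEADER.size + status_vars_len   # skip fixed part and status vars
    schema = body[pos:pos + schema_len]
    pos += schema_len + 1                      # skip schema and its NUL terminator
    query = body[pos + skew:]                  # the query is whatever remains
    return schema.decode("utf-8"), query.decode("utf-8")


body = build_query_event_body(b"sakila", b"DROP TABLE test", status_vars=b"\x00" * 5)

print(parse_query_event_body(body))           # correctly aligned read
print(parse_query_event_body(body, skew=1))   # first character of the query is lost

# If the computed length is too long, bytes that are not part of the query
# (here a stray 0x84) end up in the span being decoded.
try:
    parse_query_event_body(body + b"\x84")
except UnicodeDecodeError as exc:
    print(exc.reason)
```

With a one-byte skew the query comes back as `ROP TABLE test`, which matches the truncated first character I see, and a span that runs past the end of the query text fails to decode in the same way as the traceback in this report.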

Server version: MariaDB 10.6.5-MariaDB-log Source distribution
Python version: Python 3.9.12

How to reproduce the issue

Start the database server and run the following script (adjust your connection parameters accordingly):

from pymysqlreplication import BinLogStreamReader
from pymysqlreplication.event import QueryEvent

mysql_settings = {'host': 'localhost', 'port': 3306, 'user': 'usr_replica', 'passwd': 'replica'}

stream = BinLogStreamReader(connection_settings=mysql_settings, server_id=1002, blocking=True, only_events=[QueryEvent])

for binlogevent in stream:
    print(binlogevent.schema.decode())
    print(binlogevent.query)

stream.close()

Create a new table on the server:

CREATE TABLE test (
  id SMALLINT UNSIGNED NOT NULL AUTO_INCREMENT,
  value1 VARCHAR(45) NOT NULL,
  value2 VARCHAR(45) NOT NULL,
  last_update TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  date_create TIMESTAMP NOT NULL,
  PRIMARY KEY  (id),
  KEY idx_actor_last_name (value2)
)ENGINE=InnoDB DEFAULT CHARSET=utf8;

DROP TABLE test ;

The script will output something like this:

15:10 $ python test.py

# Dumm

# Dumm

# Dum
sakila
CREATE TABLE test (
  id SMALLINT UNSIGNED NOT NULL AUTO_INCREMENT,
  value1 VARCHAR(45) NOT NULL,
  value2 VARCHAR(45) NOT NULL,
  last_update TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  date_create TIMESTAMP NOT NULL,
  PRIMARY KEY  (id),
  KEY idx_actor_last_name (value2)
)ENGINE=InnoDB DEFAULT CHARSET=utf8

# Dum
Traceback (most recent call last):
  File "/home/thedoctor/git/pg_chameleon/test.py", line 9, in <module>
    for binlogevent in stream:
  File "/home/thedoctor/git/python-mysql-replication/pymysqlreplication/binlogstream.py", line 496, in fetchone
    binlog_event = BinLogPacketWrapper(pkt, self.table_map,
  File "/home/thedoctor/git/python-mysql-replication/pymysqlreplication/packet.py", line 136, in __init__
    self.event = event_class(self, event_size_without_header, table_map,
  File "/home/thedoctor/git/python-mysql-replication/pymysqlreplication/event.py", line 202, in __init__
    self.query = self.packet.read(event_size - 13 - self.status_vars_length
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x84 in position 42: invalid start byte
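
For anyone hitting the same error, a small helper like the one below can show which bytes surround the failure point. This is only a debugging sketch: `describe_decode_failure` is a made-up name, and it assumes you already have the raw bytes of the query in hand (for example by catching the exception around your own decode call). It relies only on the standard `UnicodeDecodeError.start` and `.end` attributes and the built-in `backslashreplace` error handler.

```python
def describe_decode_failure(raw: bytes, context: int = 8) -> str:
    """Return a short description of the first UTF-8 decode failure in `raw`."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        lo = max(exc.start - context, 0)
        window = raw[lo:exc.end + context]
        # backslashreplace keeps valid text readable and escapes invalid bytes
        shown = window.decode("utf-8", errors="backslashreplace")
        return f"byte 0x{raw[exc.start]:02x} at {exc.start}: {shown!r}"
    return "ok"


# Synthetic example: query text followed by bytes that are not UTF-8 text.
print(describe_decode_failure(b"DROP TABLE test\x84\x01"))
```

Seeing readable SQL immediately before the offending byte would suggest that the read ran past the end of the query rather than that the query itself contains bad data.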

Thanks!

@the4thdoctor
I'll fix this bug ASAP.
Thx for reporting.

Thanks :)

Hi, any update on this?
ta!

@dongwook-chan any news? ta!

We ran into the same issue and wanted to check whether there is any update on it.

@ruiyang2015
I'm so sorry. I haven't been active on issues for quite a while due to personal matters. I'll have this resolved right away.

cc. @the4thdoctor

No worries, I had a similar issue last year. I hope all is fine now :)

I confirm the issue is now solved 🥳
Thank you for the help

@the4thdoctor
Thank you for testing the new release! I'm so relieved now... I'll make sure to respond more quickly next time! I appreciate your support for the library!