julien-duponchelle/python-mysql-replication

I can't retrieve the updated or written field names; all I get are things like UNKNOWN_COL0.

Opened this issue · 15 comments

import base64
from datetime import date, datetime
import json
import traceback
import pymysql
from pymysqlreplication import BinLogStreamReader
from pymysqlreplication.row_event import (
DeleteRowsEvent,
UpdateRowsEvent,
WriteRowsEvent,
TableMapEvent
)

def default(obj):
if isinstance(obj, datetime):
return obj.isoformat()
elif isinstance(obj, date):
return obj.isoformat()
elif isinstance(obj, bytes):
return base64.b64encode(obj).decode('ascii')
raise TypeError(f"Object of type {obj.class.name} is not JSON serializable")

class BinlogListener:
def init(self, mysql_settings):
self.mysql_settings = mysql_settings
self.stream = None
self.table_map = {}

def start_stream(self):
    events = [DeleteRowsEvent, UpdateRowsEvent, WriteRowsEvent]
    
    self.stream = BinLogStreamReader(
        connection_settings=self.mysql_settings,
        server_id=101,
        only_events=events,
        resume_stream=True,
        blocking=True,
    )

def process_events(self):
    if self.stream is None:
        self.start_stream()

    for binlogevent in self.stream:
        try:
            binlogevent.dump()  
            for row in binlogevent.rows:
                
                event = {"schema": binlogevent.schema, "table": binlogevent.table}
                print(event)
                if isinstance(binlogevent, DeleteRowsEvent):
                    event["action"] = "delete"
                    event["data"] = row["values"]

                elif isinstance(binlogevent, UpdateRowsEvent):
                    event["action"] = "update"
                    event["data"] = row["after_values"] 

                elif isinstance(binlogevent, WriteRowsEvent):
                    event["action"] = "insert"
                    event["data"] = row["values"]

                print(json.dumps(event, default=default))

        except Exception as e:
            traceback.print_exc()


def stop_stream(self):
    if self.stream is not None:
        self.stream.close()
        self.stream = None

Do I need to update the MySQL configuration to include column names?

I was mistaken, indeed we need to add the following MySQL configuration

to enable FULL row metadata and row images in the binlog

binlog_row_metadata=FULL
binlog_row_image=FULL

I'm having a similar issue. When I run the code on my local connecting to an AWS RDS DB column names are present but same code deployed to AWS Fargate connecting to the same RDS instance with same credentials spits out UNKOWN_COL0, UNKNOWN_COL1, etc...

Has anyone ever run into anything like this ?

@pstanescu
Set global variable

binlog_row_metadata=FULL
binlog_row_image=FULL

@sean-k1 Where do you set those as they're already set in the RDS parameter group for the instance running mysql.

@pstanescu

  1. Go to the RDS dashboard in AWS Management Console.
  2. Navigate to Parameter Groups and select the relevant parameter group attached to your MySQL instance.
  3. Edit the parameters for binlog_row_metadata and binlog_row_image to FULL.
  4. Save the changes and then reboot the instance for the changes to take effect if required.

Would you please state the vendor and version of the compatible database that you're using next time you create an issue?
The more info you provide, the sooner we'll solve the issue.

Thanks @dongwook-chan - I already did that as I mentioned in my prior message. Those parameters are already applied.
Results after running SHOW VARIABLES LIKE 'BINLOG_ROW_IMAGE'; SHOW VARIABLES LIKE 'BINLOG_ROW_METADATA'; SHOW VARIABLES LIKE 'version';
'version', '8.0.35'
'binlog_row_image','FULL'
'binlog_row_metadata','FULL'

As I stated in my original ask - same DB instance when I connect to it from my local machine everything works as expected but when I deploy and run the code from AWS Fargate task which connects to the same DB instance I get the UNKNOWN_COL0, etc...

Any other thoughts on this @sean-k1 / @dongwook-chan ? By the way, 0.30.1 works fine, it's just when upgrading to version 1.0.8 it stops showing column names properly

@pstanescu
The data of the points before setting binlog_row_image ,binlog_row_metadata = FULL are shown as UNKNOWN_COL.
Before setting the global variable, binlog would not have that column data.

After you set that variable, Your binlog file has column Data so you should see the column mapping come out correctly.

I am also running into this issue, but with MySQL 5.7 where binlog_row_metadata variable is not supported. Is there another way to handle this? I have binlog_row_image=FULL and binlog_format=ROW currently. I noticed @dongwook-chan has worked with pg_chameleon which is where I am encountering this issue. Any thoughts?

Once I downgraded to 0.30.* or 0.45.1 I had no issues. FYI @pmcgannon22

@pmcgannon22
Just like @pstanescu said, the only solution would be to downgrade at the moment.

Have the same problem, Using AWS Aurora


+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| binlog_format | ROW   |
+---------------+-------+
1 row in set (0.34 sec)

mysql> SHOW GLOBAL VARIABLES LIKE 'binlog_row_image';
+------------------+-------+
| Variable_name    | Value |
+------------------+-------+
| binlog_row_image | FULL  |
+------------------+-------+
1 row in set (0.01 sec)

mysql> SHOW VARIABLES LIKE 'version';
+---------------+--------+
| Variable_name | Value  |
+---------------+--------+
| version       | 8.0.34 |
+---------------+--------+
1 row in set (0.00 sec)

@jiajie999 set binlog_row_metadata=FULL

@sean-k1 The account does not have permission to set "binlog_row_metadata", how to solve this issue?