julien-duponchelle/python-mysql-replication

AssertionError: Result length not requested length

Closed this issue · 24 comments

I am using mysql-replication==0.25 with pymysql==0.9.3 against MySQL RDS 5.7, and I am still getting this error:

AssertionError: Result length not requested length:
Expected=2958. Actual=30. Position: 64250. Data Length: 64280

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/tap_mysql/lib/streamer.py", line 173, in _extract_records
    for row in binlog_event.rows:
  File "/usr/local/lib/python3.7/site-packages/pymysqlreplication/row_event.py", line 443, in rows
    self._fetch_rows()
  File "/usr/local/lib/python3.7/site-packages/pymysqlreplication/row_event.py", line 438, in _fetch_rows
    self.__rows.append(self._fetch_one_row())
  File "/usr/local/lib/python3.7/site-packages/pymysqlreplication/row_event.py", line 491, in _fetch_one_row
    row["values"] = self._read_column_data(self.columns_present_bitmap)
  File "/usr/local/lib/python3.7/site-packages/pymysqlreplication/row_event.py", line 191, in _read_column_data
    values[name] = self.packet.read_binary_json(column.length_size)
  File "/usr/local/lib/python3.7/site-packages/pymysqlreplication/packet.py", line 350, in read_binary_json
    return self.read_binary_json_type(t, length)
  File "/usr/local/lib/python3.7/site-packages/pymysqlreplication/packet.py", line 355, in read_binary_json_type
    return self.read_binary_json_object(length - 1, large)
  File "/usr/local/lib/python3.7/site-packages/pymysqlreplication/packet.py", line 438, in read_binary_json_object
    data = self.read_binary_json_type(t, length)
  File "/usr/local/lib/python3.7/site-packages/pymysqlreplication/packet.py", line 355, in read_binary_json_type
    return self.read_binary_json_object(length - 1, large)
  File "/usr/local/lib/python3.7/site-packages/pymysqlreplication/packet.py", line 438, in read_binary_json_object
    data = self.read_binary_json_type(t, length)
  File "/usr/local/lib/python3.7/site-packages/pymysqlreplication/packet.py", line 355, in read_binary_json_type
    return self.read_binary_json_object(length - 1, large)
  File "/usr/local/lib/python3.7/site-packages/pymysqlreplication/packet.py", line 438, in read_binary_json_object
    data = self.read_binary_json_type(t, length)
  File "/usr/local/lib/python3.7/site-packages/pymysqlreplication/packet.py", line 355, in read_binary_json_type
    return self.read_binary_json_object(length - 1, large)
  File "/usr/local/lib/python3.7/site-packages/pymysqlreplication/packet.py", line 438, in read_binary_json_object
    data = self.read_binary_json_type(t, length)
  File "/usr/local/lib/python3.7/site-packages/pymysqlreplication/packet.py", line 355, in read_binary_json_type
    return self.read_binary_json_object(length - 1, large)
  File "/usr/local/lib/python3.7/site-packages/pymysqlreplication/packet.py", line 438, in read_binary_json_object
    data = self.read_binary_json_type(t, length)
  File "/usr/local/lib/python3.7/site-packages/pymysqlreplication/packet.py", line 357, in read_binary_json_type
    return self.read_binary_json_array(length - 1, large)
  File "/usr/local/lib/python3.7/site-packages/pymysqlreplication/packet.py", line 463, in read_binary_json_array
    return [_read(x) for x in values_type_offset_inline]
  File "/usr/local/lib/python3.7/site-packages/pymysqlreplication/packet.py", line 463, in <listcomp>
    return [_read(x) for x in values_type_offset_inline]
  File "/usr/local/lib/python3.7/site-packages/pymysqlreplication/packet.py", line 461, in _read
    return self.read_binary_json_type(x[0], length)
  File "/usr/local/lib/python3.7/site-packages/pymysqlreplication/packet.py", line 355, in read_binary_json_type
    return self.read_binary_json_object(length - 1, large)
  File "/usr/local/lib/python3.7/site-packages/pymysqlreplication/packet.py", line 438, in read_binary_json_object
    data = self.read_binary_json_type(t, length)
  File "/usr/local/lib/python3.7/site-packages/pymysqlreplication/packet.py", line 357, in read_binary_json_type
    return self.read_binary_json_array(length - 1, large)
  File "/usr/local/lib/python3.7/site-packages/pymysqlreplication/packet.py", line 463, in read_binary_json_array
    return [_read(x) for x in values_type_offset_inline]
  File "/usr/local/lib/python3.7/site-packages/pymysqlreplication/packet.py", line 463, in <listcomp>
    return [_read(x) for x in values_type_offset_inline]
  File "/usr/local/lib/python3.7/site-packages/pymysqlreplication/packet.py", line 461, in _read
    return self.read_binary_json_type(x[0], length)
  File "/usr/local/lib/python3.7/site-packages/pymysqlreplication/packet.py", line 355, in read_binary_json_type
    return self.read_binary_json_object(length - 1, large)
  File "/usr/local/lib/python3.7/site-packages/pymysqlreplication/packet.py", line 430, in read_binary_json_object
    keys = [self.read(x[1]) for x in key_offset_lengths]
  File "/usr/local/lib/python3.7/site-packages/pymysqlreplication/packet.py", line 430, in <listcomp>
    keys = [self.read(x[1]) for x in key_offset_lengths]
  File "/usr/local/lib/python3.7/site-packages/pymysqlreplication/packet.py", line 150, in read
    return data + self.packet.read(size - len(data))
  File "/usr/local/lib/python3.7/site-packages/pymysql/protocol.py", line 73, in read
    raise AssertionError(error)
AssertionError: Result length not requested length:
Expected=2958. Actual=30. Position: 64250. Data Length: 64280
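For context, this assertion comes from pymysql's strict packet read: the JSON decoder computes how many bytes a field should occupy and asks the packet for exactly that many, and when a mis-parsed length asks for more bytes than remain, pymysql raises. A minimal sketch of that check (illustrative only, not the library's actual code):

```python
def read_exact(buffer: bytes, position: int, size: int) -> bytes:
    """Ask for exactly `size` bytes starting at `position`, mimicking the
    strict read that pymysql's protocol layer performs (sketch only)."""
    data = buffer[position:position + size]
    if len(data) < size:
        # This is the shape of the error above: the parser computed a
        # bogus length (Expected) but the packet had fewer bytes left.
        raise AssertionError(
            f"Result length not requested length: "
            f"Expected={size}. Actual={len(data)}. "
            f"Position: {position}. Data Length: {len(buffer)}"
        )
    return data

print(read_exact(b"hello world", 0, 5))  # b'hello'
```

So the assertion itself is only a symptom; the real bug is upstream in how the binary JSON lengths and offsets are computed.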

@noman55 Hello, please let me know your binary log format value,
and please let me know how I can reproduce this issue.

@sean-k1
You can reproduce a similar issue with the steps below.

DB setting:

  1. MySQL 8.0.31
  2. binlog_format = ROW
  3. mysql-replication==0.43

CREATE TABLE setting_table
(
    id SERIAL AUTO_INCREMENT,
    setting JSON
);

INSERT INTO setting_table (setting) VALUES ('{"btn": true, "model": false}');

UPDATE setting_table
   SET setting = JSON_REMOVE(setting, '$.model')
 WHERE id = 1;
Traceback (most recent call last):
  File "/Users/abdullahmohammed/Abdullah/local/sub-folder/util/lazarus_debug_script_mysql_short/app.py", line 129, in <module>
    main()
  File "/Users/abdullahmohammed/Abdullah/local/sub-folder/util/lazarus_debug_script_mysql_short/app.py", line 98, in main
    for row in binlogevent.rows:
  File "/opt/miniconda3/lib/python3.10/site-packages/pymysqlreplication/row_event.py", line 480, in rows
    self._fetch_rows()
  File "/opt/miniconda3/lib/python3.10/site-packages/pymysqlreplication/row_event.py", line 475, in _fetch_rows
    self.__rows.append(self._fetch_one_row())
  File "/opt/miniconda3/lib/python3.10/site-packages/pymysqlreplication/row_event.py", line 564, in _fetch_one_row
    row["before_values"] = self._read_column_data(self.columns_present_bitmap)
  File "/opt/miniconda3/lib/python3.10/site-packages/pymysqlreplication/row_event.py", line 115, in _read_column_data
    values[name] = self.__read_values_name(column, null_bitmap, null_bitmap_index,
  File "/opt/miniconda3/lib/python3.10/site-packages/pymysqlreplication/row_event.py", line 231, in __read_values_name
    return self.packet.read_binary_json(column.length_size)
  File "/opt/miniconda3/lib/python3.10/site-packages/pymysqlreplication/packet.py", line 368, in read_binary_json
    payload = self.read(length)
  File "/opt/miniconda3/lib/python3.10/site-packages/pymysqlreplication/packet.py", line 170, in read
    return self.packet.read(size)
  File "/opt/miniconda3/lib/python3.10/site-packages/pymysql/protocol.py", line 74, in read
    raise AssertionError(error)
AssertionError: Result length not requested length:
Expected=1567384932.  Actual=3.  Position: 114.  Data Length: 117
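For reference, the UPDATE in the repro turns the stored document from {"btn": true, "model": false} into {"btn": true}, so in ROW format the binlog carries both a before and an after image of the JSON value, and the parser has to decode each. A quick plain-Python sanity check of those two images (not library code):

```python
import json

# Before image: the row as inserted
before = json.loads('{"btn": true, "model": false}')

# JSON_REMOVE(setting, '$.model') drops only the "model" key
after = {k: v for k, v in before.items() if k != "model"}

print(before)  # {'btn': True, 'model': False}
print(after)   # {'btn': True}
```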

@abdullahmafuz2020 @noman55 I'm not sure whether this PR resolves the issue.
Can you check it against that PR?

I need more of your test cases.
If the error persists, please let me know how to reproduce it.

=== QueryEvent ===
Date: 2023-09-20T07:47:28
Log position: 978
Event size: 142
Read bytes: 142
Schema: b'test'
Execution time: 0
Query: CREATE TABLE setting_table
(
    id SERIAL AUTO_INCREMENT,
    setting JSON
)

=== QueryEvent ===
Date: 2023-09-20T07:47:28
Log position: 1132
Event size: 52
Read bytes: 52
Schema: b'test'
Execution time: 0
Query: BEGIN

=== TableMapEvent ===
Date: 2023-09-20T07:47:28
Log position: 1212
Event size: 57
Read bytes: 57
Table id: 296
Schema: test
Table: setting_table
Columns: 2
=== OptionalMetaData ===
unsigned_column_list: [True, False]
default_charset_collation: None
charset_collation: {}
column_charset: []
column_name_list: ['id', 'setting']
set_str_value_list : []
set_enum_str_value_list : []
geometry_type_list : []
simple_primary_key_list: [0]
primary_keys_with_prefix: {}
visibility_list: [True, True]
charset_collation_list: []
enum_and_set_collation_list: []

=== WriteRowsEvent ===
Date: 2023-09-20T07:47:28
Log position: 1287
Event size: 52
Read bytes: 12
Table: test.setting_table
Affected columns: 2
Changed rows: 1
Column Name Information Flag: True
Values:
--
* id : 1
* setting : {b'btn': True, b'model': False}

=== XidEvent ===
Date: 2023-09-20T07:47:28
Log position: 1318
Event size: 8
Read bytes: 8
Transaction ID: 18004

=== QueryEvent ===
Date: 2023-09-20T07:54:27
Log position: 1481
Event size: 61
Read bytes: 61
Schema: b'test'
Execution time: 0
Query: BEGIN

=== TableMapEvent ===
Date: 2023-09-20T07:54:27
Log position: 1561
Event size: 57
Read bytes: 57
Table id: 296
Schema: test
Table: setting_table
Columns: 2
=== OptionalMetaData ===
unsigned_column_list: [True, False]
default_charset_collation: None
charset_collation: {}
column_charset: []
column_name_list: ['id', 'setting']
set_str_value_list : []
set_enum_str_value_list : []
geometry_type_list : []
simple_primary_key_list: [0]
primary_keys_with_prefix: {}
visibility_list: [True, True]
charset_collation_list: []
enum_and_set_collation_list: []

=== UpdateRowsEvent ===
Date: 2023-09-20T07:54:27
Log position: 1677
Event size: 93
Read bytes: 13
Table: test.setting_table
Affected columns: 2
Changed rows: 1
Column Name Information Flag: True
Values:
--
*id:1=>1
*setting:{b'btn': True, b'model': False}=>{b'btn': True}

=== XidEvent ===
Date: 2023-09-20T07:54:27
Log position: 1708
Event size: 8
Read bytes: 8
Transaction ID: 18192

@sean-k1
After enabling "BINLOG_ROW_METADATA = FULL", I tested the JSON parser bug-fix PR locally against a MySQL DB, and it's working fine.
I tried a few different test cases, and all seem to work! 🎉

I couldn't test on the production database where the actual issue was raised, because it requires "BINLOG_ROW_METADATA = FULL" to be enabled there; enabling that setting in prod and testing on it will take some days.
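In case it helps others hitting the same wall: binlog_row_metadata is a dynamic global variable on MySQL 8.0.1+ (on RDS it is changed through the DB parameter group rather than SET GLOBAL), so checking and enabling it looks like this:

```sql
-- Check the current value (the default is MINIMAL)
SHOW GLOBAL VARIABLES LIKE 'binlog_row_metadata';

-- Enable full row metadata so column names (and enum/set string
-- values) are written into the TableMapEvent's optional metadata
SET GLOBAL binlog_row_metadata = 'FULL';
```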

I will test in production and get back to you within a week.

Anyway, thanks so much for your effort in fixing this bug.

Hey @sean-k1,
I deployed the JSON parser bug-fix code from your PR in a production job.
It worked for 10 hours, then the error below popped up.
It happened on nested JSON:
{"information": {"purpose": null, "Number": null, "Status": null}}

Traceback (most recent call last):
  File "/Users/abdullahmohammed/Abdullah/local/sub-folder/util/lazarus_debug_script_mysql_short/pythonmysqlreplication/app.py", line 137, in <module>
    main()
  File "/Users/abdullahmohammed/Abdullah/local/sub-folder/util/lazarus_debug_script_mysql_short/pythonmysqlreplication/app.py", line 103, in main
    for row in binlogevent.rows:
  File "/Users/abdullahmohammed/Abdullah/local/sub-folder/util/lazarus_debug_script_mysql_short/pythonmysqlreplication/pymysqlreplication/row_event.py", line 492, in rows
    self._fetch_rows()
  File "/Users/abdullahmohammed/Abdullah/local/sub-folder/util/lazarus_debug_script_mysql_short/pythonmysqlreplication/pymysqlreplication/row_event.py", line 487, in _fetch_rows
    self.__rows.append(self._fetch_one_row())
  File "/Users/abdullahmohammed/Abdullah/local/sub-folder/util/lazarus_debug_script_mysql_short/pythonmysqlreplication/pymysqlreplication/row_event.py", line 578, in _fetch_one_row
    row["before_values"] = self._read_column_data(self.columns_present_bitmap)
  File "/Users/abdullahmohammed/Abdullah/local/sub-folder/util/lazarus_debug_script_mysql_short/pythonmysqlreplication/pymysqlreplication/row_event.py", line 121, in _read_column_data
    values[name] = self.__read_values_name(
  File "/Users/abdullahmohammed/Abdullah/local/sub-folder/util/lazarus_debug_script_mysql_short/pythonmysqlreplication/pymysqlreplication/row_event.py", line 230, in __read_values_name
    return self.packet.read_binary_json(column.length_size)
  File "/Users/abdullahmohammed/Abdullah/local/sub-folder/util/lazarus_debug_script_mysql_short/pythonmysqlreplication/pymysqlreplication/packet.py", line 382, in read_binary_json
    value = self.read_binary_json_type(t, length)
  File "/Users/abdullahmohammed/Abdullah/local/sub-folder/util/lazarus_debug_script_mysql_short/pythonmysqlreplication/pymysqlreplication/packet.py", line 390, in read_binary_json_type
    return self.read_binary_json_object(length - 1, large)
  File "/Users/abdullahmohammed/Abdullah/local/sub-folder/util/lazarus_debug_script_mysql_short/pythonmysqlreplication/pymysqlreplication/packet.py", line 487, in read_binary_json_object
    data = self.read_binary_json_type(t, length)
  File "/Users/abdullahmohammed/Abdullah/local/sub-folder/util/lazarus_debug_script_mysql_short/pythonmysqlreplication/pymysqlreplication/packet.py", line 390, in read_binary_json_type
    return self.read_binary_json_object(length - 1, large)
  File "/Users/abdullahmohammed/Abdullah/local/sub-folder/util/lazarus_debug_script_mysql_short/pythonmysqlreplication/pymysqlreplication/packet.py", line 450, in read_binary_json_object
    raise ValueError("Json length is larger than packet length")
ValueError: Json length is larger than packet length
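The ValueError here is a defensive check doing its job: before descending into a nested object, the parser compares the size declared in the binary JSON header against the bytes actually remaining, and bails out instead of over-reading. Roughly (a sketch assuming the documented small/large JSONB header layout, not the PR's exact code):

```python
import struct

def read_json_object_header(payload: bytes, large: bool = False):
    """Read the element count and declared total size from a binary JSON
    object header (2-byte fields for small objects, 4-byte for large)
    and refuse to proceed if the declared size overruns the payload."""
    fmt, width = ("<I", 4) if large else ("<H", 2)
    elements = struct.unpack_from(fmt, payload, 0)[0]
    size = struct.unpack_from(fmt, payload, width)[0]
    if size > len(payload):
        raise ValueError("Json length is larger than packet length")
    return elements, size

# A well-formed small-object header: 1 element, 8 bytes total
print(read_json_object_header(b"\x01\x00\x08\x00\x00\x00\x00\x00"))  # (1, 8)
```

Hitting this guard on the nested-null document suggests an offset into the nested object is still being computed wrongly in some path.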

@abdullahmafuz2020 Did you pull the updated branch?
Does it error on an UPDATE?
If you are stuck on a partial update,
we need to develop a partial_update_row_event first.

=== WriteRowsEvent ===
Date: 2023-09-23T07:34:19
Log position: 1593
Event size: 88
Read bytes: 12
Table: pymysqlreplication_test.json_test
Affected columns: 2
Changed rows: 1
Column Name Information Flag: True
Values:
--
* id : 2
* json_data : {b'information': {b'Number': None, b'Status': None, b'purpose': None}}

Hey @sean-k1,
No, I haven't taken the most recent code from your branch; I am running 3-day-old code.
Meanwhile, I took your recent code and ran it, and there are some issues I am facing:

{'schema': 'main_table', 'table': 'job_batches', 'binlog_timestamp': 1695416511, 'fields': {None: None}, 'type': 'update'}

It's not working as expected: fields are missing. But if I run the 3-day-old code, the fields get proper data.
This is how I am initializing BinLogStreamReader locally:

    stream = BinLogStreamReader(server_id=5555,
                                blocking=True,
                                connection_settings=MYSQL_SETTINGS,
                                only_schemas=only_schemas,
                                # only_tables=sync_tables,
                                resume_stream=True,
                                log_file='mysql-bin-changelog.026185',
                                log_pos=18004628,
                                only_events=[WriteRowsEvent,
                                             UpdateRowsEvent,
                                             DeleteRowsEvent])

Please let me know if I am missing some settings/params.

@abdullahmafuz2020 I don't think the settings are the problem,
because the 3-day-old code runs properly.
Can you share a simple test case?

@sean-k1
I didn't run any specific test case; I connected the prod replica database directly to the test branch code, hard-coding log_file and log_pos locally.

@abdullahmafuz2020

Let me summarize my understanding.
With the PR code I wrote 3 days ago, when you ran the binlog file, {'schema': 'main_table', 'table': 'job_batches', 'binlog_timestamp': 1695416511, 'fields': {None: None}, 'type': 'update'} had its fields filled with values, and when you ran the latest PR code they came out as None, right?

I don't think it's a problem with the BinLogStreamReader settings; it works fine for me with the same settings as yours, so I think the code I wrote is in error.

So I was asking how to reproduce the BinlogEvent that causes the error in your prod environment.
Correct me if I have understood something wrong!

@sean-k1 Yes, you understood correctly.
I was working on reproducing it locally.
With the recent PR changes, it happens with any event: update, insert.

Using setting_table (refer to the DDL above):

INSERT INTO db.setting_table (setting) VALUES ('{"btn": true, "model": false}');

new code result

{'schema': 'db', 'table': 'setting_table', 'binlog_timestamp': 1695472221, 'fields': {None: {b'btn': True, b'model': False}}, 'type': 'insert'}
{'schema': 'db', 'table': 'setting_table', 'binlog_timestamp': 1695472345, 'fields': {None: {b'btn': False, b'model': False}}, 'type': 'update'}
{'schema': 'db', 'table': 'setting_table', 'binlog_timestamp': 1695472398, 'fields': {None: {b'btn': True, b'model': False}}, 'type': 'delete'}

3-day-old code result

{'schema': 'db', 'table': 'setting_table', 'binlog_timestamp': 1695472221, 'fields': {'id': 24, 'setting': {b'btn': True, b'model': False}}, 'type': 'insert'}
{'schema': 'db', 'table': 'setting_table', 'binlog_timestamp': 1695472345, 'fields': {'id': 18, 'setting': {b'btn': False, b'model': False}}, 'type': 'update'}
{'schema': 'db', 'table': 'setting_table', 'binlog_timestamp': 1695472398, 'fields': {'id': 19, 'setting': {b'btn': True, b'model': False}}, 'type': 'delete'}

@sean-k1 I updated the code output results above; I had misplaced them because I was running the 2 versions in 2 different terminals. 😕
Please have a look at the results again.

@abdullahmafuz2020 I think another PR merged into the main branch is the problem.
Can you wait just a moment?

@abdullahmafuz2020 So sorry, can you run the latest PR branch?
I missed something while merging a conflicted branch.

It seems to be working fine now 🚀 and the "Json length is larger than packet length" issue has not come up so far.
I will deploy this branch in prod and will update here if any issue pops up.
Thanks @sean-k1

Hi @abdullahmafuz2020 @sean-k1, let me also deploy and test it. Sorry, I was occupied with something else and couldn't get back earlier. Just to be sure, I need to deploy from this PR branch, right? https://github.com/julien-duponchelle/python-mysql-replication/pull/481/files

@noman55 Yes, but partial updates are not supported, and time/timestamp support is still in development.

Keep that in mind.

@noman55 Oh, your MySQL server is 5.7; I think that's why the column names are None.

@noman55 @abdullahmafuz2020 I've developed time and timestamp support,
but as I said, if you update a JSON property, the UpdateEvent cannot find it.

We would love to develop PARTIAL_UPDATE_ROW_EVENT right away, but it will take some time to implement.
Hope you understand.

If you don't need the changed data from the JSON update, but just need to know that a partial update occurred, I'll add that to the PR.

@sean-k1 thanks for letting us know.

@abdullahmafuz2020 We merged this PR into the main branch!
Check it out.
I fixed the long-string cases too.
I hope this PR helps.

@noman55 We released version 0.45.0 for MySQL 5.7 users.
I think this version resolves the issue!