big tx stuck at `cleanGtidExecuted -> WaitForAllCommitted`
c0494133d4 opened this issue · 0 comments
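Background: the dtle.gtid_executed table on the target side
- For every tx executed, one row is added, recording job_name, server_uuid, and gno
- For each job and each server_uuid, the rows are consolidated once every 2048 rows, and the GTIDs are written back as intervals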
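The consolidation step can be pictured with a small Go sketch. This is not dtle's code: `Interval`, `compactGnos`, and `formatIntervals` are hypothetical names, and the only thing assumed from MySQL is the usual GTID set text form `uuid:start-end:single`.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Interval is a closed range of GTID sequence numbers [Start, End].
// Hypothetical type for illustration; dtle's real types may differ.
type Interval struct{ Start, End int64 }

// compactGnos merges individual gno values (one per table row) into
// sorted, non-overlapping closed intervals. Duplicates are tolerated.
func compactGnos(gnos []int64) []Interval {
	if len(gnos) == 0 {
		return nil
	}
	sorted := append([]int64(nil), gnos...) // do not mutate the caller's slice
	sort.Slice(sorted, func(i, j int) bool { return sorted[i] < sorted[j] })

	out := []Interval{{sorted[0], sorted[0]}}
	for _, g := range sorted[1:] {
		last := &out[len(out)-1]
		if g <= last.End+1 {
			// Adjacent to or inside the current interval: extend it if needed.
			if g > last.End {
				last.End = g
			}
			continue
		}
		out = append(out, Interval{g, g})
	}
	return out
}

// formatIntervals renders intervals in the MySQL GTID set text form.
func formatIntervals(uuid string, ivs []Interval) string {
	parts := []string{uuid}
	for _, iv := range ivs {
		if iv.Start == iv.End {
			parts = append(parts, fmt.Sprintf("%d", iv.Start))
		} else {
			parts = append(parts, fmt.Sprintf("%d-%d", iv.Start, iv.End))
		}
	}
	return strings.Join(parts, ":")
}

func main() {
	// Eight single-gno rows collapse into one summary value.
	ivs := compactGnos([]int64{1, 2, 3, 5, 7, 6, 4, 10})
	fmt.Println(formatIntervals("3E11FA47-71CA-11E1-9E33-C80AA9429562", ivs))
}
```

Under these assumptions, many single-gno rows shrink to one interval string per job and server_uuid, which is why the table only needs a periodic cleanup rather than one on every transaction.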
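Steps to reproduce
- Start an incremental job
- Run a number of small transactions under that job, so that the row count gets close to 2048
- Run a big transaction
- In dtle, a big transaction is split into multiple DataEntry items, and the commit happens only after the last one has been executed
- The big transaction gets stuck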
Debugging with delve shows it is stuck in the following frames:
2 0x0000000002691ecf in github.com/actiontech/dtle/driver/mysql.(*MtsManager).WaitForAllCommitted
at /universe/src/github.com/actiontech/dtle/driver/mysql/applier_mts.go:66
3 0x0000000002687d29 in github.com/actiontech/dtle/driver/mysql.(*ApplierIncr).cleanGtidExecuted
at /universe/src/github.com/actiontech/dtle/driver/mysql/applier_gtid_executed.go:249
4 0x000000000268bd09 in github.com/actiontech/dtle/driver/mysql.(*ApplierIncr).handleEntry
at /universe/src/github.com/actiontech/dtle/driver/mysql/applier_incr.go:309
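Problem
- Within a big transaction, `gtidSetItem.NRow`, the counter for the number of rows in dtle.gtid_executed, is incremented incorrectly, so cleanGtidExecuted runs too early. WaitForAllCommitted then waits for the current transaction to finish, but that transaction cannot commit until handleEntry returns, so the job deadlocks.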
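The failure mode can be modelled in a few lines of Go. This is a simplified sketch, not dtle's implementation: `applier`, `entry`, `handle`, and `run` are hypothetical, and it assumes the faulty increment happens once per DataEntry rather than once per committed transaction, which the issue does not spell out. Instead of actually blocking, the model reports when a cleanup would be requested while a transaction is still open.

```go
package main

import "fmt"

// cleanupThreshold is the consolidation interval named in the issue.
const cleanupThreshold = 2048

// entry models one DataEntry; Final marks the last piece of a transaction.
type entry struct{ Final bool }

// applier is a toy stand-in for the incremental applier.
type applier struct {
	nRow          int  // models gtidSetItem.NRow
	countPerEntry bool // true: assumed buggy counting; false: count once per tx
	openTx        bool // a transaction has started but not yet committed
}

// handle processes one entry. It returns true when a cleanup would be
// triggered while a transaction is still uncommitted, i.e. the point at
// which waiting for all commits can never finish.
func (a *applier) handle(e entry) (stuck bool) {
	a.openTx = !e.Final
	if a.countPerEntry || e.Final {
		a.nRow++
	}
	if a.nRow >= cleanupThreshold {
		if a.openTx {
			// The cleanup would wait for this transaction, but the
			// transaction can only commit after this call returns.
			return true
		}
		a.nRow = 0 // cleanup ran safely between transactions
	}
	return false
}

// run replays smallTxs single-entry transactions followed by one big
// transaction split into bigTxEntries pieces.
func run(countPerEntry bool, smallTxs, bigTxEntries int) bool {
	a := &applier{countPerEntry: countPerEntry}
	for i := 0; i < smallTxs; i++ {
		if a.handle(entry{Final: true}) {
			return true
		}
	}
	for i := 0; i < bigTxEntries; i++ {
		if a.handle(entry{Final: i == bigTxEntries-1}) {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println("count per entry, stuck:", run(true, 2040, 20))
	fmt.Println("count per tx, stuck:", run(false, 2040, 20))
}
```

In this model the counter crosses 2048 in the middle of the big transaction only when it is bumped per entry. Counting once per transaction keeps the cleanup between transactions, which is one plausible direction for a fix.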