TODO list
zzt93 opened this issue · 3 comments
TODO
- Cold-start (ETL) opt
  - hold buffer full handle
  - hold change after now
- Test Framework more
  - join
  - performance test
- Add `/stat`, `/input` endpoint for syncer
- Timezone config
- Convert MySQL integer as byte array
  - add auto conversion with meta info fetched: unsigned int: long, unsigned long: byte array
- Dependency module: not package & load if not use mongo sync etc.
  - SLF4j
  - Nginx
- IncludeBefore & IncludeUpdated config?
- `rpm` and `dpkg`
- Sync check: query input & output for comparing
  - Implement by a special `SyncData`?
  - Should have an HTTP endpoint to invoke it.
- Warning if multiple `schema.table` have different rows
- A better serializer than json, which loses the type info: PB? Avro?
- Test redis output & nested sql
- Config thread for each consumer
- Support set parent of ES
- Row image format support?
  - Add must appeared field restriction -- now only primary key
  - Opt: keep only changed field in update event & primary key in delete event -- include must appear field
- Batch module & failure module is coupled with channel module
  - Filter chain?
  - Failure module as last channel?
- MDC.put eventId is necessary??
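
For the unsigned integer conversion item in the list above, a minimal Java sketch of the two widenings it names (unsigned int to long, unsigned long to byte array). It assumes the binlog client hands over unsigned column values as signed Java primitives with the same bit pattern; `UnsignedConversion` and its method names are hypothetical and not part of syncer.

```java
import java.math.BigInteger;

// Hypothetical helper, not part of syncer: shows the two widenings named above.
final class UnsignedConversion {

  // UNSIGNED INT read as a signed 32-bit int: widen it to a non-negative long.
  static long unsignedIntToLong(int raw) {
    return Integer.toUnsignedLong(raw);
  }

  // UNSIGNED BIGINT read as a signed 64-bit long: return big-endian
  // two's-complement bytes; a leading 0x00 byte keeps large values positive.
  static byte[] unsignedLongToBytes(long raw) {
    return new BigInteger(Long.toUnsignedString(raw)).toByteArray();
  }

  public static void main(String[] args) {
    System.out.println(unsignedIntToLong(-1));                         // 4294967295
    System.out.println(new BigInteger(1, unsignedLongToBytes(-1L)));   // 18446744073709551615
  }
}
```

Whether a column is unsigned would still have to come from the fetched meta info; the sketch only covers the value conversion.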
Done:
- Output need customization of Spring EL -- Remove spring EL
- Mysql input field check
- Add new cold start: batch select (order by id) & batch insert
- Shorten id:
  - change serverId to port/clientId?
  - serverId: not for unique purpose, but for debug -- removed, to save memory
  - variable integer encoding for position: xxx/123456/gap/xxx
  - shorten offset
- Support set start binlog file name & position in config file (make it easier to rebuild)
- Refactor `clone` & `dup` semantic -- change to `create`
- Reduce memory footprint of `StandardEvaluationContext` (20% memory reduction)
- Add file as data source: to read binlog file
- Update failure log format: not escape json string
- Order problem: make same id to same thread; strict mode: retry error item and all left; retry only error item
- Output channel reconnection logic: MySQL & ES
- Adjust logging level dynamically
- Add health check endpoint
- `upsert` for es output channel if 404
- Add shutdown hook to do clean up: stop sending data to output target, avoid dup key exception
- Update to Spring Boot 2.0 for better yaml prompt when config
- Skip synced item if already synced when startup
- Add kafka output channel
  - kafka msg consumer has to handle event idempotently;
  - send event using primary key as `key`
- deploy `SyncData`, `SyncUtil` as separate jar to maven central
- Refactor config naming:

      input:
        masters:
          - connection:
              address: ${HOST_ADDRESS}
              port: 27018
            type: Mongo
            repos:
              - name: "chat"
                entities:
                  - name: messages
                    fields: [time, content]

- Package refactor:
  - For `syncer-data` deploy
  - Refactor `config` package
- Add kafka version compatibility in readme.
- Reduce useless dependency: remove spring boot
- Refactor filter module design flaw & add nested `if` and/or enhance `switcher`
- Use javassist/cglib/byte buddy `JavaCompiler` to generate code dynamically rather than spring el
- Support config key like `lower-hyphen`
- Binlog checksum type auto detection
- kafka MESSAGE TOO LARGE
- Share same table definition for multiple remote
- Test framework
- Refactor `SyncData`: `update event` should have `before` & `fields` data:
  - add `updated()` & `updated(String name)` method for use
  - add `before` to get before data
- Test framework: add update/delete test
- Update README config example: remove and link to test config dir.
- Test framework: mongo
- Check whether registered MongoDB db/collection exists
- Batch buffer bug
- Opt logging: Ack log, MasterConnector
- Connect to latest binlog flag (cold start usage)
  - de-register cold-start consumer?
  - or use same consumer, different filter?
- Add consumerId in log
  - or report thread-consumer relation in http port
  - or change thread name to `syncer-consumerId-filter-1`
- ConsumerId syntax check: not support `-`
- `FileBasedMap` record last removed position if map is empty
- Change from tailing oplog to use change stream api: check mongo version when startup
- ES output channel support `nested` obj
- Alter table auto re-sync mysql column index so no need to restart
- ES client upgrade (5.x, 7.x, not all features, 6.x all features) -- rest client & basic auth to replace xpack & low level rest client
- Test framework:
  - Mongo update/delete
- Change filter module to single thread, add partition key support in syncData which will be used in output module (multiple thread)
- Order problem when id is changed: add scheduler key
- Joining like this will inevitably cause data inconsistency because of the `at-least-once-semantic`, so will not do.
  - ES can make it by nested obj
  - Kafka needs this
- Filter module not shutdown but use failure log
- Pressure test continue
- Degradation & Bound queue size: change to fixed sized queue
- Column filter: `_all`
- Cold start
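
The ordering items above ("make same id to same thread", partition key support in syncData) come down to a stable key-to-worker mapping. A minimal Java sketch of that idea follows, assuming a fixed number of output workers; `KeyPartitioner` is a hypothetical name, and this is not syncer's actual scheduler.

```java
import java.util.Objects;

// Hypothetical sketch, not syncer's actual scheduler: maps a partition key
// (for example a row id) to a fixed worker index.
final class KeyPartitioner {
  private final int workers;

  KeyPartitioner(int workers) {
    if (workers <= 0) {
      throw new IllegalArgumentException("workers must be positive");
    }
    this.workers = workers;
  }

  // The same key always maps to the same worker index in [0, workers),
  // so events for one id are handled in order by one thread.
  int workerFor(Object key) {
    // floorMod keeps the index non-negative even for negative hash codes.
    return Math.floorMod(Objects.hashCode(key), workers);
  }

  public static void main(String[] args) {
    KeyPartitioner partitioner = new KeyPartitioner(4);
    System.out.println(partitioner.workerFor(42L)); // 2
  }
}
```

If the id itself changes in an update, old and new ids may hash to different workers, which is the case the scheduler key item addresses.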
Testing & Implementing
- Update position even not interested in
- Share storage in k8sMode
  - Sync meta info to ZK like
  - k8sMode need a instanceId to differentiate
    - storage path `/instanceId/syncer/xx`
    - config file?
- Kafka output: timestamp to long;
- Mysql output: auto add id;
- #8 [Test Pending] MySQL upsert support: for join table order problem -- ref
- [Impl Pending] Update sync meta position when consumer not interested in this event?
  - Implement by a simple position flusher typed event?
  - emit when trying to shutdown?
  - emit when `num` not interested event happened
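
One possible shape for the "emit when `num` not interested event happened" option above, as a minimal Java sketch. The class name, the threshold parameter, and the single-threaded use are all assumptions for illustration; nothing here is decided or implemented in syncer.

```java
// Hypothetical sketch: counts events no consumer is interested in and signals
// when a position flush is due. Not thread-safe; assumes one caller thread.
final class SkippedEventCounter {
  private final int threshold;
  private int skipped;

  SkippedEventCounter(int threshold) {
    if (threshold <= 0) {
      throw new IllegalArgumentException("threshold must be positive");
    }
    this.threshold = threshold;
  }

  // Call once per skipped event. Returns true when `threshold` skipped events
  // have accumulated since the last signal, then starts counting again.
  boolean onSkipped() {
    skipped++;
    if (skipped >= threshold) {
      skipped = 0;
      return true;
    }
    return false;
  }

  public static void main(String[] args) {
    SkippedEventCounter counter = new SkippedEventCounter(3);
    for (int i = 1; i <= 4; i++) {
      System.out.println("event " + i + " flush=" + counter.onSkipped());
    }
  }
}
```

The caller would emit the position flusher event whenever `onSkipped()` returns true, and once more on shutdown to cover the remainder.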
Not Do
- Schema mis-match problem -- fix by new cold-start method -- `ETL`
  - Write schema of all tables to local file, then parse all DDL to update it.
  - Start to load schema from files
- Cold start
  - connect to latest binlog (can't resolve mis-match in this situation)
- Netty as http client (idempotence is hard to achieve)
- Support rpc output channel (idempotence is hard to achieve)
- Support websocket for long lived connection (idempotence is hard to achieve)
- Join by query extra data source in output?
- Make output module non-blocking with callback, so reduce filter-output thread?
  - May cause disorder of event -- make it as config option: `non-block-mode`