Migrator Tool with Change Streams Missing Some Documents
Closed this issue · 2 comments
Hey,
We're attempting to use the Migrator Tool with Change Streams to move data from DocumentDB to sharded MongoDB (i.e. we're issuing the writes to a mongos instance).
It appears to work for the most part, but it misses a few dozen documents, so not every document is transferred.
For example, our exchanges.league_event collection in DocumentDB has 681,653 documents. The migrator tool moves 681,601 documents to MongoDB, and the remaining 52 documents are never transferred.
These are the logs from the tool:
python3 cdc-multiprocess.py --source-uri 'mongodb://aarontest:<PASSWORD>i@t-aarontest-docdb-240528-1229-9f.cluster-cxismqp4lkcr.us-east-1.docdb.amazonaws.com:27018/?tls=true&tlsCAFile=global-bundle.pem&replicaSet=rs0&readPreference=secondaryPreferred&retryWrites=false' --target-uri 172.27.79.116:27017 --source-namespace exchanges.league_event --use-change-stream --start-position NOW
[2024-05-30T14:46:33.797Z] thread -1 | processing changeStream using 1 threads
[2024-05-30T14:46:33.797Z] thread -1 | starting with timestamp = 2024-05-30 14:46:33+00:00
[2024-05-30T14:47:33.933Z] elapsed 00:01:00.05 | total o/s 0.00 | interval o/s 0.00 | tot 0 | 0 secs behind | resume token = N/A
[2024-05-30T14:48:33.993Z] elapsed 00:02:00.11 | total o/s 20.82 | interval o/s 41.64 | tot 2,501 | 3 secs behind | resume token = 01665891be00005b9c0100005b9c00109397
[2024-05-30T14:49:34.136Z] elapsed 00:03:00.17 | total o/s 304.17 | interval o/s 870.80 | tot 54,801 | 31 secs behind | resume token = 01665891df0000298e010000298e00109397
[2024-05-30T14:50:34.405Z] elapsed 00:04:00.31 | total o/s 913.41 | interval o/s 2,738.43 | tot 219,501 | 58 secs behind | resume token = 0166589200000006c301000006c300109397
[2024-05-30T14:51:34.617Z] elapsed 00:05:00.57 | total o/s 1,282.89 | interval o/s 2,756.27 | tot 385,601 | 86 secs behind | resume token = 016658922000001feb0100001feb00109397
[2024-05-30T14:52:34.874Z] elapsed 00:06:00.79 | total o/s 1,684.09 | interval o/s 3,686.65 | tot 607,601 | 102 secs behind | resume token = 016658924c00000643010000064300109397
[2024-05-30T14:53:34.995Z] elapsed 00:07:01.05 | total o/s 1,618.82 | interval o/s 1,228.01 | tot 681,601 | 147 secs behind | resume token = 016658925b00001645010000164500109397
[2024-05-30T14:54:35.054Z] elapsed 00:08:01.17 | total o/s 1,416.55 | interval o/s 0.00 | tot 681,601 | 208 secs behind | resume token = 016658925b00001645010000164500109397
[2024-05-30T14:55:35.102Z] elapsed 00:09:01.22 | total o/s 1,259.39 | interval o/s 0.00 | tot 681,601 | 268 secs behind | resume token = 016658925b00001645010000164500109397
The logs suggest it gets to 681,601 documents, and then the secs behind just keeps growing.
We start the tool to monitor the Change Stream and transfer documents minutes before we start inserting documents into DocDB, so it shouldn't be a case of those 52 documents having expired from the Change Stream.
We see the exact same thing with another collection: it has 340,375 documents in DocDB, and the tool only moves 340,301 documents to MongoDB (missing 74).
Hey,
I did some more debugging today and figured out what's going on here...
The collections I was transferring from DocDB -> sharded MongoDB were not receiving any writes.
The script buffers updates from the Change Stream and then bulk-writes them when either the buffer fills up or enough time (5 seconds) has passed since the last write.
However, this check is only executed when an event arrives on the Change Stream.
So because these collections are not being actively written to, the final batch of documents is never flushed.
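The failure mode can be sketched in a few lines of Python. This is a simplified stand-in for the tool's loop, not its actual code; the `BATCH_SIZE` value and the list-of-batches target are hypothetical:

```python
import time

BATCH_SIZE = 3        # hypothetical batch limit, kept small for illustration
FLUSH_INTERVAL = 5.0  # seconds; the 5-second threshold mentioned above

def drain(stream, writes):
    """Consume change-stream events, buffering and bulk-writing them.

    `stream` is any iterable of events; `writes` collects the flushed
    batches (standing in for bulk_write on the target). Note that the
    flush conditions are only evaluated *inside* the loop body: if the
    stream stops yielding events, whatever is left in the buffer is
    never flushed.
    """
    buffer = []
    last_flush = time.monotonic()
    for event in stream:
        buffer.append(event)
        if len(buffer) >= BATCH_SIZE or time.monotonic() - last_flush >= FLUSH_INTERVAL:
            writes.append(list(buffer))  # stand-in for collection.bulk_write(...)
            buffer.clear()
            last_flush = time.monotonic()
    return buffer  # leftover events that were never written

# Seven events, then the stream goes quiet: two full batches are
# flushed, but the last event stays stuck in the buffer forever.
flushed = []
leftover = drain(range(7), flushed)
print(len(flushed), len(leftover))  # → 2 1
```

With a live change stream the `for` loop blocks waiting for the next event, so the time-based condition is never even re-checked once the collection goes idle.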
If I write a fake document to the collection, the script realises that more than 5 seconds have passed since the last write and flushes the buffered documents.
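That nudge can be simulated in isolation. This is again a sketch rather than the tool's code, with the flush interval shortened so the example runs quickly (the real threshold is 5 seconds):

```python
import time

BATCH_SIZE = 100       # hypothetical batch limit, never reached here
FLUSH_INTERVAL = 0.5   # shortened from 5s so the example runs quickly

buffer = []
flushed = []
last_flush = time.monotonic()

def on_event(event):
    """Hypothetical per-event handler: buffer, then flush if a condition holds."""
    global last_flush
    buffer.append(event)
    if len(buffer) >= BATCH_SIZE or time.monotonic() - last_flush >= FLUSH_INTERVAL:
        flushed.append(list(buffer))  # stand-in for bulk_write on the target
        buffer.clear()
        last_flush = time.monotonic()

# 52 real events arrive quickly, then the stream goes quiet:
# neither condition fires, so nothing is flushed.
for i in range(52):
    on_event({"_id": i})
assert flushed == [] and len(buffer) == 52

# A single "fake" write after the interval has elapsed gives the handler
# a chance to run the time check, which flushes the stuck documents.
time.sleep(FLUSH_INTERVAL + 0.1)
on_event({"_id": "dummy"})
print(len(flushed[0]))  # → 53
```

An alternative fix in the tool itself would be to run the time-based flush check on a timer (or a change-stream read timeout) instead of only on event arrival, so quiet collections still drain their final batch.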
I guess this isn't really a bug, more a misunderstanding of how it's supposed to work, so I'm closing this.