MirrorMaker migration documentation
OneCricketeer opened this issue · 7 comments
Regarding the Medium post
Mirus completely replaced Mirror Maker across all production data-centers at Salesforce in April 2018. Since then our data volumes have continued to grow.
For those who are running Mirror Maker with active consumer group offsets for their data, and who would prefer not to get duplicates after starting Mirus, is there migration documentation available, or a runbook that Salesforce followed for the replacement?
No documentation available yet, but I will put something together based on our experience at Salesforce.
+1 :-)
@mtrienis Still on my todo list. The short version is that we shut down Mirror Maker, grabbed the Mirror Maker offsets using kafka-consumer-groups.sh, then used bin/mirus-offset-tool.sh with the --reset-offsets and --from-file flags to initialize the Mirus connector offsets. Then, when Mirus started, it was able to pick up where Mirror Maker left off with no duplicates.
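The step between the two tools (turning the saved Mirror Maker group offsets into a file for --from-file) can be sketched roughly as follows. This is a minimal Python sketch, not part of Mirus and not the procedure Salesforce used. It assumes the text saved from kafka-consumer-groups.sh --describe has a header row with TOPIC, PARTITION and CURRENT-OFFSET columns (the column layout differs between Kafka versions), that the target is the one-JSON-object-per-line format shown later in this thread, and that the function name and the connector id are placeholders.

```python
import json


def describe_to_mirus_lines(describe_output, connector_id):
    """Convert saved consumer-group describe text into Mirus-style JSON lines.

    Assumption: the text contains a header row naming the TOPIC, PARTITION
    and CURRENT-OFFSET columns, and values contain no embedded whitespace.
    """
    lines = [ln for ln in describe_output.splitlines() if ln.strip()]

    # Find the header by column name rather than position, because the
    # column order is not the same in every Kafka version.
    header_idx = None
    for i, ln in enumerate(lines):
        if "CURRENT-OFFSET" in ln.split():
            header_idx = i
            break
    if header_idx is None:
        raise ValueError("no header row with a CURRENT-OFFSET column found")

    header = lines[header_idx].split()
    topic_col = header.index("TOPIC")
    partition_col = header.index("PARTITION")
    offset_col = header.index("CURRENT-OFFSET")
    needed = max(topic_col, partition_col, offset_col)

    result = []
    for ln in lines[header_idx + 1:]:
        cols = ln.split()
        if len(cols) <= needed:
            continue  # not a data row
        if not (cols[partition_col].isdigit() and cols[offset_col].isdigit()):
            continue  # e.g. "-" when the group has no committed offset
        record = {
            "connectorId": connector_id,  # placeholder: your connector name
            "topic": cols[topic_col],
            "partition": int(cols[partition_col]),
            "offset": int(cols[offset_col]),
        }
        result.append(json.dumps(record, separators=(",", ":")))
    return result
```

Writing the returned lines to a file, one per line, gives a candidate input for --from-file. It is worth comparing that file against real --describe output from mirus-offset-tool.sh before resetting anything.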
For the first few clusters we actually left Mirror Maker running in parallel for a few minutes, and accepted the duplicates, just to guarantee everything was running as expected. We still used mirus-offset-tool.sh to initialize our offsets to avoid a flood of duplicates.
Idea:
Could MirusOffsetTool be extended to capture the offset listing functionality of ConsumerGroupCommand so that two scripts wouldn't be needed?
@pdavidson100 @Cricket007 Could you please share a sample file, or the format of the file, that we supply to MirusOffsetTool with the --from-file flag for resetting offsets?
I'm getting a "not a valid Long value" exception when I try to reset offsets.
@Hari4AMQ The --from-file format is identical to the output format generated by --describe, and supports both CSV and JSON (recommended for setting offsets). For example, if you're setting the offsets of a 4-partition topic to 100, the file might look like this:
{"connectorId":"connector-id","topic":"topic-name","partition":0,"offset":100}
{"connectorId":"connector-id","topic":"topic-name","partition":1,"offset":100}
{"connectorId":"connector-id","topic":"topic-name","partition":2,"offset":100}
{"connectorId":"connector-id","topic":"topic-name","partition":3,"offset":100}
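For topics with many partitions, a file like the one above could be generated rather than typed by hand. The following is a small Python sketch under the assumption that the four fields shown above are all the tool needs. The function name is hypothetical, and the integer check is only a sanity check, not a diagnosis of the "not a valid Long value" error reported earlier.

```python
import json


def uniform_offset_lines(connector_id, topic, partitions, offset):
    """Build one JSON line per partition, all set to the same offset."""
    # Offsets must be plain non-negative integers (bool is rejected because
    # it is a subclass of int in Python).
    if isinstance(offset, bool) or not isinstance(offset, int) or offset < 0:
        raise ValueError("offset must be a non-negative integer")
    return [
        json.dumps(
            {
                "connectorId": connector_id,
                "topic": topic,
                "partition": partition,
                "offset": offset,
            },
            separators=(",", ":"),
        )
        for partition in range(partitions)
    ]


if __name__ == "__main__":
    # Reproduces the 4-partition example from this thread.
    for line in uniform_offset_lines("connector-id", "topic-name", 4, 100):
        print(line)
```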
As @pdavidson100 mentioned, you should run the --describe option first and then edit the output file, setting the offsets you need for the partitions you want. This is the command I would use to get the offsets for topic t1:
bin/mirus-offset-tool.sh --properties-file config/<worker.properties> --describe --format json | grep "\"topic\":\"t1\"" > t1-offsets.json
Then edit the file t1-offsets.json with the desired offsets.
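If hand-editing is error-prone, the edit step could also be scripted. The sketch below is a hypothetical helper, not part of Mirus. It assumes each line of t1-offsets.json is one JSON object with "partition" and "offset" fields, as in the example earlier in this thread, and it leaves every other field untouched.

```python
import json


def set_offsets(json_lines, new_offsets):
    """Return the JSON lines with offsets replaced for selected partitions.

    new_offsets maps partition number -> desired offset; partitions that
    are not in the mapping keep their current offset.
    """
    updated = []
    for line in json_lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        record = json.loads(line)
        partition = record["partition"]
        if partition in new_offsets:
            # int() keeps the value a JSON number rather than a string.
            record["offset"] = int(new_offsets[partition])
        updated.append(json.dumps(record, separators=(",", ":")))
    return updated
```

To apply it, read t1-offsets.json into a list of lines, call set_offsets(lines, {0: 100, 1: 250}) with your own partition-to-offset mapping, and write the result to a new file to pass to --from-file, so the original --describe output stays available for comparison.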