j-easy/easy-batch

Is it possible to provide a BatchRecordProcessor?

andrehanika opened this issue · 3 comments

Hi easy-batch team

Our use case is synchronizing data between two databases (source = db1, target = db2). We load records batch-wise from db1, and in the record processor we load the corresponding entity from db2 for further synchronization. For performance reasons, it would be nice to have a BatchRecordProcessor that receives the currently loaded batch from db1, so that we can also load the data from db2 batch-wise instead of issuing a single load per record. A workaround would be to use the Writer or afterBatchProcessing, both of which receive the current batch, but then we bypass the validators.
Is there another workaround or solution already?

Thank you and regards,
André

No parallel processing.
db1 = source
db2 = target
E = Entity

Reader: loads E1, E2, E3, E4 from db1
Processor db1.E1: load db2.E1 -> sync data db1.E1 to db2.E1
Processor db1.E2: load db2.E2 -> sync data db1.E2 to db2.E2
Processor db1.E3: load db2.E3 -> sync data db1.E3 to db2.E3
Processor db1.E4: load db2.E4 -> sync data db1.E4 to db2.E4
Writer: persist[db2.E1, db2.E2, db2.E3, db2.E4]

Reader: loads E1, E2, E3, E4 from db1
BatchProcessor [db1.E1, db1.E2, db1.E3, db1.E4]: load all corresponding entities from db2
Writer: persist[db2.E1, db2.E2, db2.E3, db2.E4]
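
The batch-wise flow above could look roughly like the following plain-Java sketch. It does not use any Easy Batch classes: `Entity`, `TargetDb`, `findAllByIds` and `syncBatch` are hypothetical names made up for illustration, and the in-memory map only stands in for db2. The point is that the whole batch costs one lookup instead of one lookup per record.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical entity; not part of Easy Batch.
final class Entity {
    final long id;
    String value;

    Entity(long id, String value) {
        this.id = id;
        this.value = value;
    }
}

// Hypothetical stand-in for db2 that counts round trips.
final class TargetDb {
    private final Map<Long, Entity> rows = new HashMap<>();
    int queries = 0;

    void put(Entity e) {
        rows.put(e.id, e);
    }

    // One round trip for the whole batch, e.g. SELECT ... WHERE id IN (...).
    Map<Long, Entity> findAllByIds(List<Long> ids) {
        queries++;
        Map<Long, Entity> found = new HashMap<>();
        for (Long id : ids) {
            Entity e = rows.get(id);
            if (e != null) {
                found.put(id, e);
            }
        }
        return found;
    }
}

final class BatchSyncSketch {
    // Syncs a batch of db1 entities onto their db2 counterparts using a single lookup.
    static List<Entity> syncBatch(List<Entity> sourceBatch, TargetDb db2) {
        List<Long> ids = new ArrayList<>();
        for (Entity source : sourceBatch) {
            ids.add(source.id);
        }

        Map<Long, Entity> targets = db2.findAllByIds(ids);

        List<Entity> toPersist = new ArrayList<>();
        for (Entity source : sourceBatch) {
            Entity target = targets.get(source.id);
            if (target == null) {
                // Assumption: a missing target entity is created rather than skipped.
                target = new Entity(source.id, null);
            }
            target.value = source.value;
            toPersist.add(target);
        }
        return toPersist; // what the Writer would persist to db2
    }
}
```

With a batch of four entities, `syncBatch` issues one `findAllByIds` call where the per-record processor would issue four loads.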


@andrehanika The processing model you are looking for was implemented in v4 (figure: batch-processing-v4), where it was possible to process records in batches. However, that model had several design issues, which were explained in #211 and fixed in v5+.

I don't know why you excluded parallel processing. I think the approach suggested by @marcusdemian is a good option for you, since the data synchronization tasks are (I guess) independent. Something similar to the fork/join tutorial might work for you. Otherwise, you can use afterBatchProcessing and do the synchronization and the manual validation there.
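
As a rough illustration of the last option, the sketch below shows what "manual validation" inside a batch-level hook might look like. Only the hook name `afterBatchProcessing` comes from this thread. The sketch defines its own types instead of using Easy Batch classes, so `SourceRow`, `ManualValidationSketch` and the `Predicate`-based validator are all hypothetical, and the real listener signature may differ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical record payload; not an Easy Batch type.
final class SourceRow {
    final long id;
    final String name;

    SourceRow(long id, String name) {
        this.id = id;
        this.name = name;
    }
}

final class ManualValidationSketch {
    // Rows from the most recent batch, split by validation outcome.
    final List<SourceRow> accepted = new ArrayList<>();
    final List<SourceRow> rejected = new ArrayList<>();
    private final Predicate<SourceRow> validator;

    ManualValidationSketch(Predicate<SourceRow> validator) {
        this.validator = validator;
    }

    // Meant to be called from a batch-level hook such as afterBatchProcessing.
    void afterBatchProcessing(List<SourceRow> batch) {
        accepted.clear();
        rejected.clear();
        for (SourceRow row : batch) {
            // The check a record validator would otherwise perform per record.
            if (validator.test(row)) {
                accepted.add(row);
            } else {
                rejected.add(row);
            }
        }
        // The batch-wise load from db2 and the synchronization of `accepted` would go here.
    }
}
```

Keeping the validation rule in a single `Predicate` means the same check can be reused wherever the batch is handled, which partly offsets bypassing the regular validators.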

I'm closing this issue but feel free to add a comment if you need further support.