Write stream data to MongoDB
fracek opened this issue · 4 comments
fracek commented
Is your feature request related to a problem? Please describe.
Many use cases involve transforming data and then writing it into MongoDB. This database is then used to implement the API used by the frontend.
Describe the solution you'd like
We should add a sink that writes stream data directly to MongoDB, without having to implement a TypeScript/Python program to do it.
Additional context
These are a couple of things to keep in mind:
- Write all records to the same collection. The collection name will be a command-line argument to the sink.
- The "transform" step should return an array of records. The sink should probably check that this is the case and, if not, print a helpful error message.
- Add a `_cursor` field to each record containing the `end_cursor.order_key` of the batch.
- On "invalidate", delete all records where `_cursor > cursor.order_key`, where `cursor` is the cursor in the invalidate message.
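
A minimal sketch of the behaviour described in the list above, written in Python purely for illustration. The function names `handle_data` and `handle_invalidate` are hypothetical, and the sink itself may end up looking quite different. The `collection` argument is assumed to behave like a pymongo `Collection`, of which only `insert_many` and `delete_many` are used.

```python
from typing import Any


def handle_data(collection: Any, end_cursor_order_key: int, records: Any) -> int:
    """Tag every record with `_cursor` and insert the batch.

    Returns the number of records written. `collection` is assumed to
    expose `insert_many`, as a pymongo Collection does.
    """
    # The transform step is expected to return an array of records.
    if not isinstance(records, list):
        raise TypeError(
            "transform step must return an array of records, "
            f"got {type(records).__name__}"
        )
    for index, record in enumerate(records):
        if not isinstance(record, dict):
            raise TypeError(
                f"record at index {index} must be an object, "
                f"got {type(record).__name__}"
            )
    # pymongo rejects an empty insert_many, so skip empty batches.
    if not records:
        return 0
    # Copy each record so the caller's objects are not mutated.
    docs = [{**record, "_cursor": end_cursor_order_key} for record in records]
    collection.insert_many(docs)
    return len(docs)


def handle_invalidate(collection: Any, cursor_order_key: int) -> None:
    """Delete every record written after the cursor in the invalidate message."""
    collection.delete_many({"_cursor": {"$gt": cursor_order_key}})
```

With pymongo installed, the collection selected by the command-line argument could be passed in directly as `collection`. Storing `_cursor` as a plain number is an assumption here. It keeps the `$gt` filter a simple numeric comparison.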
bigherc18 commented
I'd be happy to work on this one
fracek commented
Great! Thank you
bigherc18 commented
Why should the "transform" step return an array?
fracek commented
I believe that in 99% of the cases where it doesn't, it's a bug in the transform step. That way we can print a helpful message that explains exactly where the issue is.