apibara/dna

Write stream data to MongoDB

fracek opened this issue · 4 comments

Is your feature request related to a problem? Please describe.
Many use cases involve transforming data and then writing it into MongoDB. This database is then used to implement the API used by the frontend.

Describe the solution you'd like
We should add a sink that writes stream data directly to MongoDB, so users don't need to implement a TypeScript/Python program to do it.

Additional context
Here are a couple of things to keep in mind:

  • Write all records to the same collection. The collection name will be a command-line argument to the sink.
  • The "transform" step should return an array of records. The sink should probably check that this is the case and, if not, print a helpful error message.
  • Add a _cursor field to each record containing the end_cursor.order_key of the batch.
  • On "invalidate", delete all records where _cursor > cursor.order_key, where cursor is the cursor in the invalidate message.
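To make the points above concrete, here is a rough Python sketch of the record handling. It is only an illustration under assumptions, not the sink's actual implementation: the function names `prepare_records` and `invalidate_filter` are hypothetical, order keys are assumed to be integers, and records are assumed to be plain objects (dicts).

```python
from typing import Any


def prepare_records(transformed: Any, end_order_key: int) -> list[dict]:
    """Check the transform output and tag each record with `_cursor`.

    `end_order_key` stands for the batch's end_cursor.order_key
    (assumed here to be an integer).
    """
    if not isinstance(transformed, list):
        # Hypothetical wording for the helpful error message.
        raise TypeError(
            "transform step must return an array of records, "
            f"got {type(transformed).__name__}"
        )
    for index, record in enumerate(transformed):
        if not isinstance(record, dict):
            raise TypeError(
                f"record at index {index} must be an object, "
                f"got {type(record).__name__}"
            )
    # Copy each record so the transform output is left untouched.
    return [{**record, "_cursor": end_order_key} for record in transformed]


def invalidate_filter(order_key: int) -> dict:
    """Build the MongoDB filter matching records with _cursor > order_key."""
    return {"_cursor": {"$gt": order_key}}
```

With pymongo, the results could be passed to `collection.insert_many(...)` and `collection.delete_many(...)` respectively. An empty batch would need to skip the insert, since `insert_many` rejects an empty list.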

I'd be happy to work on this one

Great! Thank you

Why should the "transform" step return an array?

I believe that in 99% of the cases where it doesn't, it's a bug in the transform step. Checking for an array lets us print a helpful message that explains exactly where the issue is.