spine-tools/spine-items

Support for specifying in which order different items write to the same DS

spine-o-bot opened this issue · 6 comments

In GitLab by @manuelma on Dec 5, 2020, 10:49

There's a use case where the user connects several importers, tools, data stores, etc., to another data store, and of course the behavior should be that data from all those items is written into the DS. At the moment that's what happens, but the order in which data from different items is written depends on dagster orchestration, so it is either random or fixed, and in neither case can the user control it.

The new requirement is a backend mechanism for controlling the order, coupled with a user interface for specifying it.

For the UI, one idea is to use the 'rank' numbers somehow. So, if several items have the same number (which means they are allowed to run in parallel) the user would be able to provide an additional ranking (or something like that, not too clear to me at the moment).

For the backend, one idea is to use a server (I believe this is what web applications do when they need to 'notify' a client of something). So basically, when an item wants to write to a db, it asks permission from the server. The server knows the right order, so it can determine whether it's the item's turn to write or not. This can be managed with blocking sockets, so if one of the items arrives too early to write, it will wait on the socket till the server says 'go on'. This of course requires running the server on the 'recipient' DS item, and passing the server address to all incoming items (probably in the resource that is advertised 'backwards').
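A minimal sketch of that idea, using only the Python standard library. Everything here is hypothetical and for illustration only: the names `WriteOrderServer` and `write_in_turn`, the 'go'/'done' messages, and the item names are assumptions, not existing Spine code. Each item connects, sends its name, and blocks on the socket until the server replies; the server lets the next item through only after the previous one reports that it has finished.

```python
import socket
import threading


class WriteOrderServer:
    """Hypothetical server that lets items write in a fixed, known order."""

    def __init__(self, order):
        self._order = list(order)
        self._turn = 0  # index of the item currently allowed to write
        self._cond = threading.Condition()
        # Port 0 asks the OS for a free port; the address would be passed to incoming items.
        self._sock = socket.create_server(("127.0.0.1", 0))
        self.address = self._sock.getsockname()
        threading.Thread(target=self._accept_loop, daemon=True).start()

    def _accept_loop(self):
        with self._sock:
            for _ in self._order:  # expect exactly one connection per item
                conn, _addr = self._sock.accept()
                threading.Thread(target=self._handle, args=(conn,), daemon=True).start()

    def _is_turn(self, name):
        return self._turn < len(self._order) and self._order[self._turn] == name

    def _handle(self, conn):
        with conn:
            name = conn.recv(1024).decode()
            with self._cond:
                self._cond.wait_for(lambda: self._is_turn(name))
            conn.sendall(b"go")  # unblocks the waiting item
            conn.recv(1024)  # block until the item reports it is done
            with self._cond:
                self._turn += 1
                self._cond.notify_all()


def write_in_turn(address, name, write):
    """Item side: ask permission, wait on the socket, write, report back."""
    with socket.create_connection(address) as conn:
        conn.sendall(name.encode())
        conn.recv(1024)  # blocks until the server says 'go'
        write()
        conn.sendall(b"done")


def run_demo():
    written = []
    server = WriteOrderServer(["importer", "tool", "data_store"])
    # Start the items in the 'wrong' order on purpose.
    threads = [
        threading.Thread(
            target=write_in_turn,
            args=(server.address, name, lambda n=name: written.append(n)),
        )
        for name in ["data_store", "tool", "importer"]
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return written
```

In this sketch `run_demo()` collects the item names in the order the server dictates, regardless of the order in which the items connect. A real implementation would also need timeouts and handling for items that fail or never show up.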

In GitLab by @manuelma on Dec 5, 2020, 10:50

changed the description

In GitLab by @manuelma on Dec 5, 2020, 10:53

To clarify, a much simpler alternative for the backend would be to use a Python multiprocessing.Lock, but I don't know how to pass a Lock in the ProjectItemResource. Can the Lock be serialized? This needs to be investigated.

On the flip side, passing a server address in the ProjectItemResource feels completely ok.
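A quick way to probe the serialization question, assuming (this is an assumption, not something confirmed here) that the resource would be serialized with pickle. The helper `can_pickle` is hypothetical. As far as I know, multiprocessing locks refuse to be pickled outside of process spawning, while a plain (host, port) address tuple pickles without trouble.

```python
import multiprocessing
import pickle


def can_pickle(obj):
    """Return (True, '') if obj can be pickled, else (False, reason)."""
    try:
        pickle.dumps(obj)
    except Exception as error:  # report the failure instead of raising
        return False, f"{type(error).__name__}: {error}"
    return True, ""


# A multiprocessing.Lock is meant to be shared through inheritance,
# i.e. handed to a child process when that process is created.
lock_ok, lock_reason = can_pickle(multiprocessing.Lock())

# A server address is just a tuple of a string and an int (example values).
address_ok, _ = can_pickle(("127.0.0.1", 5000))

print("Lock picklable:", lock_ok, lock_reason)
print("Address picklable:", address_ok)
```

If pickle is not the mechanism actually used, the check would need to be repeated with the real one.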

In GitLab by @jkiviluo on Dec 7, 2020, 10:39

On the UI side, we also discussed using 'light' (e.g. dashed) arrows to establish the order and not to pass data. The numbers would be there too, but they would be based on the order coming from the arrows. Numbers alone might get messed up when new items are added (the arrows can maintain this better, since they indicate which one needs to come before).
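One way the numbers could be derived from the arrows is a topological sort. The sketch below is only an illustration under assumptions: the function `write_ranks`, the (before, after) pair representation, and the item names are hypothetical, and it uses `graphlib.TopologicalSorter` from the standard library (Python 3.9 or later).

```python
from graphlib import TopologicalSorter


def write_ranks(order_arrows):
    """Map each item to a write rank derived from (before, after) arrows."""
    sorter = TopologicalSorter()
    for before, after in order_arrows:
        # 'after' may only write once 'before' has written.
        sorter.add(after, before)
    # static_order() raises graphlib.CycleError if the arrows form a loop.
    return {item: rank for rank, item in enumerate(sorter.static_order(), start=1)}


arrows = [("importer", "tool"), ("tool", "exporter")]
ranks = write_ranks(arrows)

# Inserting a new item between two existing ones only needs new arrows;
# the numbers are recomputed rather than edited by hand.
new_arrows = [("importer", "new_tool"), ("new_tool", "tool"), ("tool", "exporter")]
new_ranks = write_ranks(new_arrows)
```

With the arrows as the source of truth, adding `new_tool` shifts the ranks of `tool` and `exporter` automatically, which is the robustness argument made above.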

In GitLab by @manuelma on Dec 10, 2020, 09:50

assigned to @manuelma

In GitLab by @PekkaSavolainen on Dec 22, 2020, 12:57

moved to spine-tools/Spine-Toolbox#934