cytoscape/cytoscape-explore

Import / export should be non-blocking

Opened this issue · 5 comments

Description of new feature

What should the new feature do? For visual features, include an image/mockup of the expected output.

Importing and exporting data on the server should be non-blocking (via streaming etc.).

Motivation for new feature

Describe your use case for this new feature.

  1. Generally, the event loop for Node should not be blocked. See the docs, if you're not already familiar with this. Streaming helps to keep the event loop unblocked.
  2. Streaming can be applied trivially to the Cytoscape JSON imports by using a JSON stream. I don't think these routes are used yet, but it would be nice to have something more production-ready than the existing prototype.
  3. If streaming is difficult for NDEx/CX conversions, then we should at least yield regularly (e.g. in loops).
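As a rough illustration of the "yield regularly" fallback in point 3, here is a minimal sketch. It assumes the conversion can be expressed as a per-item function. The names `yieldToEventLoop`, `convertInBatches`, `convertItem`, and `batchSize` are hypothetical and do not come from the existing code.

```javascript
// Hypothetical helper: resolves on a later turn of the event loop,
// so pending I/O callbacks get a chance to run in between batches.
const yieldToEventLoop = () =>
  new Promise((resolve) => setImmediate(resolve));

// Hypothetical converter: applies `convertItem` to every item and
// yields to the event loop after each `batchSize` items.
async function convertInBatches(items, convertItem, batchSize = 1000) {
  const results = [];
  for (let i = 0; i < items.length; i++) {
    results.push(convertItem(items[i]));
    if ((i + 1) % batchSize === 0) {
      await yieldToEventLoop();
    }
  }
  return results;
}
```

This keeps the whole result in memory, so it only addresses blocking, not memory use. A suitable `batchSize` would need to be measured rather than guessed.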

I am pretty sure CX is considered valid JSON, so maybe a JSON stream can be applied to CX import/export as well.

Maybe. Possibly not if CX is complex (e.g. dependencies in the structure). The Cy JSON is just a basic list, so it's definitely streamable. In any case, the important part is that sooner or later the import/export stuff shouldn't block -- even periodic yielding would be sufficient. Everything else on the server is just simple I/O, which Node is good/fast at. The import/export stuff currently is not.
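For the export direction, a flat list of elements can be streamed using only Node's built-in `stream` module. The sketch below assumes the elements are available as an array or async iterable. `jsonArrayChunks` and `createExportStream` are hypothetical names, not existing functions in the project. Streaming the import side would additionally need an incremental JSON parser, which is not shown here.

```javascript
const { Readable } = require('stream');

// Hypothetical generator: emits a JSON array one element at a time,
// so the complete JSON string is never built in memory.
async function* jsonArrayChunks(elements) {
  yield '[';
  let first = true;
  for await (const el of elements) {
    yield (first ? '' : ',') + JSON.stringify(el);
    first = false;
  }
  yield ']';
}

// Wraps the generator in a Readable so it can be piped to a response.
function createExportStream(elements) {
  return Readable.from(jsonArrayChunks(elements));
}
```

In a route handler this could be used as `createExportStream(elements).pipe(res)`, which lets the stream's backpressure handling pace the serialization.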

We don't need those sorts of changes now, but they would be needed in some way to make the system production-ready. So I wanted to start a discussion issue that would eventually turn into a work item so that we can start to think about it, plan, etc.

I think it shouldn't be hard to change the server-side import function to an async function. Are you thinking of allowing the client to access the server-side document while the server function is still constructing it? The syncher object will need to maintain the data integrity in the db during the creation process.

I think it shouldn't be hard to change the server-side import function to an async function.

Some options may be more difficult than others in practice. The streaming approach would be ideal, since it is better on memory than the simple yielding approach. Anything would be better than what we have now, and that's the main thing.

Are you thinking of allowing the client to access the server-side document while the server function is still constructing it?

That probably should just be forbidden to keep things simple.

The syncher object will need to maintain the data integrity in the db during the creation process.

Yeah, it's probably sufficient to set a flag once the import has completed. The route or the editor could forbid responses when the flag isn't set.
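A minimal sketch of that flag idea, using an in-memory `Map` as a stand-in for the real database. All names here (`docs`, `beginImport`, `addElement`, `finishImport`, `readDocument`, `importComplete`) are hypothetical, and the 409 status is just one possible choice for "not ready yet".

```javascript
// Hypothetical in-memory stand-in for the database.
const docs = new Map();

// Create the document record with the flag unset.
function beginImport(id) {
  docs.set(id, { importComplete: false, elements: [] });
}

// Add one element while the import is still in progress.
function addElement(id, element) {
  docs.get(id).elements.push(element);
}

// Set the flag once every element has been written.
function finishImport(id) {
  docs.get(id).importComplete = true;
}

// Guard a route could call before responding to a client.
function readDocument(id) {
  const doc = docs.get(id);
  if (!doc) return { status: 404 };
  if (!doc.importComplete) return { status: 409 };
  return { status: 200, body: doc.elements };
}
```

Clients would then only ever see a document that is either absent, still importing, or complete, which avoids exposing a partially built network.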