cunningham-lab/neurocaas

DANDI integration


Let's meet and discuss DANDI integration

Would it be possible to integrate a dataset selector on the "process" page so that a user could point to an existing DANDI asset instead of uploading a new file?

As a starting point, maybe a user could just enter the UUID or S3 path of an asset, which they can already look up in the DANDI archive web interface.
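A minimal sketch of what accepting either form could look like, in Python. The function name `resolve_asset_reference` is hypothetical. The `blobs/<first 3 chars>/<next 3 chars>/<uuid>` key layout is inferred only from the example path shared later in this thread, and the sketch assumes the UUID entered is the blob identifier that appears in that S3 key, which may not be the same as the asset identifier shown in the web interface.

```python
import re
import uuid

# Matches s3://<bucket>/<key>; used only to recognise paths that are already complete.
S3_PATH = re.compile(r"^s3://[^/]+/.+$")


def resolve_asset_reference(text, bucket="dandiarchive"):
    """Return an s3:// path for a user-supplied S3 path or UUID.

    Assumption: blobs live under blobs/<uuid[0:3]>/<uuid[3:6]>/<uuid>,
    as in the single example path from this thread.
    """
    text = text.strip()
    if S3_PATH.match(text):
        # Already a full S3 path: pass it through unchanged.
        return text
    try:
        blob_id = str(uuid.UUID(text))  # canonical lowercase, hyphenated form
    except ValueError:
        raise ValueError(f"not a UUID or s3:// path: {text!r}") from None
    return f"s3://{bucket}/blobs/{blob_id[:3]}/{blob_id[3:6]}/{blob_id}"
```

A form field on the "process" page could pass its raw text through a helper like this before handing the path to the analysis.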

Hey Ben,

Absolutely, let’s meet. We recently added a few features that we expect will make this easier, and it would be great to figure out what remains to be done. I’ll reach out via email to schedule a time.

Meeting 08/30 notes:

Goal: Have DANDI/NeuroCAAS implementation for CaImAn.
End Date: September 30th, 2022

Current Roadblocks:

  1. NeuroCAAS implementation of CaImAn may have to be modified to accommodate DANDI files. It may be necessary to expand the expected use cases of the NeuroCAAS implementation or add another argument to let NeuroCAAS find the relevant data within the NWB file.
  2. Bucket bypass, as currently implemented, expects the data and the config file to come from the same bucket. This will not work for DANDI because config files don't have a place in the DANDI bucket.
  3. For DANDI, it makes most sense for config files to be local. We can then upload them to NeuroCAAS and then host the results from there.

Proposed Solutions:

  1. Edit CaImAn implementation to accommodate DANDI files [2-5 hours].
  2. Adapt bucket bypass for CaImAn to accommodate config and input data from different places. In particular, config from the NeuroCAAS bucket, and input from the DANDI bucket [2 hours].
  3. Build lightweight CLI that takes analysis name, data path and config path, and runs on NeuroCAAS. Pull the data upload and polling mechanism from the neurocaas_contrib repo and integrate it here [4 hours]
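A rough sketch of the CLI shape from item 3, with the data and config kept as separate sources as in item 2. Everything here is an assumption for illustration: the function names, the job dictionary fields, and the decision to only print the job description. It does not upload, submit, or poll, and it does not use any `neurocaas_contrib` code.

```python
import argparse
import json
from urllib.parse import urlparse


def split_s3(path):
    """Split s3://bucket/key into (bucket, key)."""
    parsed = urlparse(path)
    key = parsed.path.lstrip("/")
    if parsed.scheme != "s3" or not parsed.netloc or not key:
        raise ValueError(f"expected s3://bucket/key, got {path!r}")
    return parsed.netloc, key


def build_job(analysis, data_path, config_path):
    """Describe a job whose data sits in S3 and whose config is a local file.

    The field names are hypothetical, not the NeuroCAAS submit format.
    """
    bucket, key = split_s3(data_path)
    return {
        "analysis": analysis,
        "data": {"bucket": bucket, "key": key},
        "config": {"local_path": config_path},
    }


def main(argv=None):
    parser = argparse.ArgumentParser(description="Describe a NeuroCAAS job (sketch).")
    parser.add_argument("analysis", help="analysis name")
    parser.add_argument("data_path", help="s3:// path to the input data")
    parser.add_argument("config_path", help="local path to the config file")
    args = parser.parse_args(argv)
    job = build_job(args.analysis, args.data_path, args.config_path)
    # Placeholder for the real work: upload the config, submit, then poll.
    print(json.dumps(job, indent=2))
    return job
```

Calling `main(["caiman", "s3://dandiarchive/blobs/...", "config.yaml"])` with a full data path prints the job description; the analysis name and config filename in that call are placeholders. Keeping the data bucket explicit in the job is one way to avoid the same-bucket assumption described in the roadblocks.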

To implement these solutions, we need:

  • S3 path for example dataset from DANDI (ask for this from @bendichter)
  • config file for this data (use a generic one or see if you can get some reasonable parameters @bendichter)

You can use this s3 asset which is an NWB file that contains 2p data: s3://dandiarchive/blobs/cb2/fc5/cb2fc5d3-3a37-4e1c-bc03-c35cf83db68c

For the config file, I'm not really concerned about parameters. For now, I'd just like to get it to run.