lilab-bcb/cirrocumulus

[cirro-server] open dataset from url path

yuhuyoyo opened this issue · 6 comments

Is your feature request related to a problem? Please describe.
For example, I have a cirrocumulus server running with my data mounted at ./cirro-data, such as /cirro_data/pbmc3k.zarr.
I'd like to open that dataset directly, either by putting its path in the URL (e.g. http://localhost:3000/cirro_data/pbmc3k.zarr) or by passing it as a query param.

Currently I need to import the dataset and then open it from inside the UI.

Describe the solution you'd like
I'd like to open a mounted dataset directly, either by putting its path in the URL (e.g. http://localhost:3000/cirro_data/pbmc3k.zarr) or by passing it as a query param.

As a workaround, you can programmatically add a dataset by posting to the /api/dataset endpoint:

curl http://localhost:5000/api/dataset -X POST -F 'name=my_name' -F 'url=data/my_dataset_path' -F 'description=my_desc' -F 'species=Mus musculus'

Then you can link to the added dataset by id.
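
For scripted setups, the same request can be built from Python. This is a minimal sketch that mirrors the curl command above using only the standard library. It assumes the endpoint accepts the multipart form fields that `curl -F` sends; `build_add_dataset_request` is a hypothetical helper name, not part of cirrocumulus.

```python
import uuid
import urllib.request


def build_add_dataset_request(base_url, fields):
    """Build (but do not send) a multipart/form-data POST to /api/dataset."""
    boundary = uuid.uuid4().hex
    lines = []
    for key, value in fields.items():
        lines.append(f"--{boundary}")
        lines.append(f'Content-Disposition: form-data; name="{key}"')
        lines.append("")
        lines.append(str(value))
    lines.append(f"--{boundary}--")
    lines.append("")
    body = "\r\n".join(lines).encode("utf-8")
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/api/dataset",
        data=body,
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )


# Same field values as the curl example above.
request = build_add_dataset_request(
    "http://localhost:5000",
    {
        "name": "my_name",
        "url": "data/my_dataset_path",
        "description": "my_desc",
        "species": "Mus musculus",
    },
)
# To actually send it: urllib.request.urlopen(request)
```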

I can see that's useful if I'm importing my data from somewhere else. But if I already have the data mounted as a volume inside the container, why do I still need to do this step?

Also, when you say link to the added dataset by id, do you mean I can get there through path-based routing (e.g. localhost:5000/datasets/<my_dataset_name>), or do you mean from the open button in the UI?

I'm fine with curling /api/dataset to add the dataset, but I'd also like to be able to get to it directly via localhost:5000/datasets/<my_dataset_name> as a follow-up step. I'm happy to contribute to the code too if you think the request is reasonable.

Alternatively, we could have dynamic routing:

  1. Pass in the root directory as an environment variable, e.g. ./cirro_data.
  2. Find all the supported files inside it and serve each one as a dataset.
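
The two steps above could be sketched roughly as follows. This is only an illustration of the proposal, not how cirrocumulus works today: the environment variable name `CIRRO_DATA_ROOT`, the helper `discover_datasets`, and the set of suffixes treated as "supported" are all assumptions.

```python
import os
from pathlib import Path

# Assumption: illustrative suffixes only; the real list of supported formats may differ.
SUPPORTED_SUFFIXES = {".zarr", ".h5ad"}


def discover_datasets(root_dir):
    """Map a URL-style relative path to each dataset found under root_dir."""
    root = Path(root_dir)
    found = {}
    for dirpath, dirnames, filenames in os.walk(root):
        current = Path(dirpath)
        # A .zarr store is a directory: register it and do not descend into it.
        for name in list(dirnames):
            if Path(name).suffix in SUPPORTED_SUFFIXES:
                found[(current / name).relative_to(root).as_posix()] = current / name
                dirnames.remove(name)
        # Single-file formats are registered directly.
        for name in filenames:
            if Path(name).suffix in SUPPORTED_SUFFIXES:
                found[(current / name).relative_to(root).as_posix()] = current / name
    return found


# Step 1: read the root directory from a (hypothetical) environment variable.
data_root = os.environ.get("CIRRO_DATA_ROOT", "./cirro_data")
# Step 2: scan it; the keys could then serve as route suffixes.
datasets = discover_datasets(data_root)
```

A router could then resolve a request path such as `pbmc3k.zarr` by looking it up in the returned mapping.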

After you add your dataset, you can link to the dataset by id. For example:
http://127.0.0.1:5000/#q={"dataset":"65cfa6fcddd71edd9e7ff6d4"}
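
A link of this form can be generated in a script. The sketch below simply reproduces the URL pattern shown above; `dataset_link` is a hypothetical helper name, and only the `dataset` key from the example is used.

```python
import json


def dataset_link(base_url, dataset_id):
    """Build a deep link of the form <base>/#q={"dataset":"<id>"}."""
    # Compact separators keep the JSON free of spaces, matching the example link.
    query = json.dumps({"dataset": dataset_id}, separators=(",", ":"))
    return f"{base_url.rstrip('/')}/#q={query}"


link = dataset_link("http://127.0.0.1:5000", "65cfa6fcddd71edd9e7ff6d4")
print(link)
```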

You can retrieve all datasets on the server using the datasets endpoint (e.g. http://127.0.0.1:5000/api/datasets).
Does this work for your use case? Thanks.
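
To go from a dataset name to its id, the listing from the datasets endpoint can be searched. The following is a minimal sketch; it assumes the endpoint returns a JSON array of objects with `id` and `name` keys, which this thread does not confirm. `fetch_datasets` and `find_dataset_id` are hypothetical helper names.

```python
import json
import urllib.request


def fetch_datasets(base_url):
    """Fetch the dataset listing from the server (performs a network call)."""
    with urllib.request.urlopen(f"{base_url.rstrip('/')}/api/datasets") as response:
        return json.load(response)


def find_dataset_id(datasets, name):
    """Return the id of the single dataset with the given name."""
    # Assumption: each entry is a dict with "id" and "name" keys.
    matches = [d["id"] for d in datasets if d.get("name") == name]
    if len(matches) != 1:
        raise LookupError(
            f"expected exactly one dataset named {name!r}, found {len(matches)}"
        )
    return matches[0]


# Made-up sample in the assumed response shape, so the lookup runs offline.
sample = [
    {"id": "65cfa6fcddd71edd9e7ff6d4", "name": "my_name"},
    {"id": "aaaaaaaaaaaaaaaaaaaaaaaa", "name": "other"},
]
dataset_id = find_dataset_id(sample, "my_name")
# Against a live server: find_dataset_id(fetch_datasets("http://127.0.0.1:5000"), "my_name")
```

Raising on zero or multiple matches is a deliberate choice here, since nothing in the thread says dataset names are unique.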

Thanks Josh, I can work with that :)

Hi Josh, circling back here.

Can we support http://127.0.0.1:5000/#q={"url":"/path/to/*.zarr"} too? Is there any reason we are not enforcing uniqueness based on the dataset path?