feat: add `gosling.datasets`
manzt opened this issue · 4 comments
It might be nice to add some convenience exports for reusable example datasets for gosling. This could remove some of the boilerplate in the examples, for instance:
import gosling as gos
- from gosling.data import multivec
+ from gosling.datasets import cistrome_multivec
- data = multivec(
-     url="https://server.gosling-lang.org/api/v1/tileset_info/?d=cistrome-multivec",
-     row="sample",
-     column="position",
-     value="peak",
-     categories=["sample 1", "sample 2", "sample 3", "sample 4"],
-     binSize=5,
- )
- base_track = gos.Track(data, width=800, height=100)
+ base_track = gos.Track(cistrome_multivec, width=800, height=100)
This would be really useful! We can refer to the list of public data used in the JS editor:
Great, thank you! It would make sense to export the "complete" datasets, unless these URLs can be interpreted differently.
E.g.
cistrome = multivec(
    url="https://server.gosling-lang.org/api/v1/tileset_info/?d=cistrome-multivec",
    row="sample",
    column="position",
    value="peak",
    categories=["sample 1", "sample 2", "sample 3", "sample 4"],
    binSize=5,
)
vs:
cistrome = "https://server.gosling-lang.org/api/v1/tileset_info/?d=cistrome-multivec"
Yes, I think it makes sense, since those datasets are used across multiple examples with the same data configs. I guess one can also use cistrome['url'] to access the URL and use different configs.
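A rough sketch of that reuse pattern, not tied to the actual gos implementation: it assumes the exported dataset behaves like a plain dict (which the cistrome['url'] access above suggests). The helper `multivec_stub` is a hypothetical stand-in for `gosling.data.multivec`, used here only so the snippet runs without gos installed.

```python
# Hypothetical stand-in for gosling.data.multivec. Assumption: the real
# loader returns a dict-like object holding the URL and the data options.
def multivec_stub(url, **options):
    return {"type": "multivec", "url": url, **options}


# The "complete" dataset, as it might be exported from gosling.datasets.
cistrome = multivec_stub(
    url="https://server.gosling-lang.org/api/v1/tileset_info/?d=cistrome-multivec",
    row="sample",
    column="position",
    value="peak",
    categories=["sample 1", "sample 2", "sample 3", "sample 4"],
    binSize=5,
)

# Reuse only the URL and supply a different config (here, a larger bin size).
coarse = multivec_stub(
    url=cistrome["url"],
    row="sample",
    column="position",
    value="peak",
    categories=cistrome["categories"],
    binSize=10,
)
```

With this shape, the exported dataset covers the common case, while anyone who needs a different config can still pull out the URL and build their own.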
In retrospect, I don't think we should have "magic" datasets in Gos. It somewhat obscures the use of the API and might be confusing to new users. I'm going to close for now.