Do we still want to store SVO data in the repo?
79ef96d added quite a large amount of data to the repo that was originally retrieved from SVO and (at least in theory) cached.
I'm now generally looking into the whole SVO infrastructure we have in effects.ter_curves_utils, with the ultimate goal of getting rid of one of the last remaining astropy-based downloads (PR coming soon), and I'm now wondering if storing all this data in the repo and actually packaging it with ScopeSim is (still) the best approach. Thoughts, @hugobuddel?
Perhaps we can move the data to ScopeSim_Data and make that a dependency of ScopeSim. (ScopeSim-the-application that is; ScopeSim-the-library might not need it).
But I don't want to remove the data entirely, because we (apparently) cannot trust the SVO to be online. I had to scour old astropy caches to find the data in that commit, and I don't want us to be in that situation again. Also, I want the essential part of our software to work without internet access.
> But I don't want to remove the data entirely, because we (apparently) cannot trust the SVO to be online. I had to scour old astropy caches to find the data in that commit, and I don't want us to be in that situation again. Also, I want the essential part of our software to work without internet access.
Yeah, I can see that. Though I'm considering renaming the files to .xml (which is what they are anyway) because the way they're named now confuses Windows (e.g. HAWKI.H, Windows thinks it's a C/C++ Header) and also GitHub (e.g. HAWKI.Ks, GitHub says on ScopeSim's repo page we're using KerboScript, which is a bit odd). And then ofc adapt a few lines so ScopeSim still finds those files, but that shouldn't be too much of a hassle.
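A minimal sketch of what such a rename could look like, assuming the files sit flat in one directory. The function name `rename_to_xml` and the `data_dir` argument are hypothetical, not something that exists in ScopeSim. The suffix is appended rather than replaced, because the part after the dot (e.g. `H` or `Ks`) is the filter name and must survive.

```python
from pathlib import Path


def rename_to_xml(data_dir, dry_run=True):
    """Append ".xml" to every file in data_dir that lacks that suffix.

    Hypothetical helper for illustration only. Returns a list of
    (old_name, new_name) pairs; files are only touched if dry_run is False.
    """
    renamed = []
    for path in sorted(Path(data_dir).iterdir()):
        # Skip subdirectories and files that already carry the suffix.
        if not path.is_file() or path.suffix.lower() == ".xml":
            continue
        # Append instead of using with_suffix(), which would turn
        # "HAWKI.Ks" into "HAWKI.xml" and drop the filter name.
        target = path.with_name(path.name + ".xml")
        if not dry_run:
            path.rename(target)
        renamed.append((path.name, target.name))
    return renamed
```

Running it with the default `dry_run=True` first only lists the planned renames, which makes it easy to check the result before anything changes on disk. The lookup code would then need to request `HAWKI.Ks.xml` instead of `HAWKI.Ks`.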
Also I need to sort out the caching while removing the astropy-based download there. I'd like to avoid saving files to the installed package, wherever that is, and rather use a "neutral" cache location (which is done indeed by the astropy downloads, but just saying...).
> Perhaps we can move the data to ScopeSim_Data and make that a dependency of ScopeSim. (ScopeSim-the-application that is; ScopeSim-the-library might not need it).
Eventually something along those lines might be good...
> Though I'm considering renaming the files to .xml (which is what they are anyway) because the way they're named now confuses Windows (e.g. HAWKI.H, Windows thinks it's a C/C++ Header) and also GitHub (e.g. HAWKI.Ks, GitHub says on ScopeSim's repo page we're using KerboScript, which is a bit odd). And then ofc adapt a few lines so ScopeSim still finds those files, but that shouldn't be too much of a hassle.
OK renaming the files is fine. I didn't actually pay attention to that.
> Also I need to sort out the caching while removing the astropy-based download there. I'd like to avoid saving files to the installed package, wherever that is, and rather use a "neutral" cache location (which is done indeed by the astropy downloads, but just saying...).
I propose to have the same three-layer structure as I did for spextra (IIRC):
- Use a path explicitly set by the user
- Use the ScopeSim_Data directory if it is installed
- Use a default path
Using the ScopeSim_Data directory would be particularly useful to find already cached data, but in its current form it might be less useful for storing newly retrieved data, because that would (IIRC) effectively be storing data in the site-packages dir, which is bad.
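Roughly, the three layers could be sketched as below. Everything named here is an assumption for illustration: the environment variable `SCOPESIM_SVO_CACHE`, the import name `scopesim_data`, the default location under `~/.cache`, and all function names are hypothetical and not taken from spextra, skycalc_ipy or ScopeSim. The sketch reads from all three layers but only writes to the explicit or default path, so nothing new ends up in site-packages.

```python
import importlib.util
import os
from pathlib import Path

ENV_VAR = "SCOPESIM_SVO_CACHE"   # hypothetical environment variable
DATA_PACKAGE = "scopesim_data"   # assumed import name of ScopeSim_Data


def default_cache_dir():
    """Hypothetical "neutral" default location outside any installed package."""
    return Path(os.path.expanduser("~")) / ".cache" / "scopesim" / "svo"


def find_data_package_dir(package=DATA_PACKAGE):
    """Return the install directory of the data package, or None if absent."""
    try:
        spec = importlib.util.find_spec(package)
    except (ImportError, ValueError):
        return None
    if spec is None or spec.origin is None:
        return None
    return Path(spec.origin).parent


def resolve_read_dirs(user_path=None):
    """Directories to search, in priority order (layers 1, 2, 3)."""
    candidates = []
    explicit = user_path or os.environ.get(ENV_VAR)
    if explicit:                      # 1. path explicitly set by the user
        candidates.append(Path(explicit))
    package_dir = find_data_package_dir()
    if package_dir is not None:       # 2. ScopeSim_Data, if it is installed
        candidates.append(package_dir)
    candidates.append(default_cache_dir())  # 3. default path
    return candidates


def find_cached_file(name, user_path=None):
    """Return the first existing copy of `name`, or None if not cached."""
    for directory in resolve_read_dirs(user_path):
        candidate = directory / name
        if candidate.is_file():
            return candidate
    return None


def resolve_write_dir(user_path=None):
    """Where new downloads go: explicit path or default, never the package."""
    explicit = user_path or os.environ.get(ENV_VAR)
    return Path(explicit) if explicit else default_cache_dir()
```

With this split, a caller would try `find_cached_file(...)` first and only download into `resolve_write_dir(...)` when that returns `None`.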
> > Perhaps we can move the data to ScopeSim_Data and make that a dependency of ScopeSim. (ScopeSim-the-application that is; ScopeSim-the-library might not need it).
>
> Eventually something along those lines might be good...
We can take steps towards that. Maybe we can have ScopeSim_Data move its data to somewhere other than the site-packages directory? Or have two locations?
> I propose to have the same three-layer structure as I did for spextra (IIRC):
> - Use a path explicitly set by the user
> - Use the ScopeSim_Data directory if it is installed
> - Use a default path
That's what skycalc_ipy does now, so we could indeed standardize on that, ideally with the slight twist, as you mentioned, that ScopeSim_Data should do something other than modify the site-packages directory...
Oh yeah, skycalc_ipy, because we also cannot trust ESO :-).
One note: downloading to the ScopeSim_Data directory is also a feature. The nightly job of ScopeSim_Data installs ScopeSim_Data with pip install -e . and then everything is downloaded into the git clone. The job will subsequently create a pull request if there is any new data. So while the behaviour is bad for users, it is also essential. But we can manage that.
> One note: downloading to the ScopeSim_Data directory is also a feature. The nightly job of ScopeSim_Data installs ScopeSim_Data with pip install -e . and then everything is downloaded into the git clone. The job will subsequently create a pull request if there is any new data. So while the behaviour is bad for users, it is also essential. But we can manage that.
That's fine because ScopeSim_Data is not a PyPI package (yet? idk)...
I think we can make ScopeSim_Data a PyPI package. I initially intended it just for internal use, but I have recommended the package to others as well, so making it a PyPI package makes sense to me.