[REVIEW] Notebooks directory inside stripy source code
santisoler opened this issue · 13 comments
Part of openjournals/joss-reviews#1410
Leaving the source code of the package and the notebooks that show examples of its applications separated is a good practice in order to keep the repository better organized.
I found that the stripy
related notebooks are located inside /stripy-src/stripy/Notebooks
and linked on the /stripy/Notebooks
directory.
I think that moving the notebooks to the latter path would tidy up the repo a little bit, making easier to keep source code separated from documentation/examples.
Do you agree? Have you included the notebooks on that location for some special reason?
The original reason was to be able to bundle them standalone with the pypi installation and install notebooks after a pip install. We can probably move those to a different location but we usually do not have a separate repository for this level of documentation. I expect we can engineer for setup.py to reach up and out to get the notebooks from elsewhere in the repo.
Oh ok! I didn't notice that was the reason for the current location of the notebooks.
So, in light of this new reason let me rethink the issue.
The current location is not harmful, so we could consider it as a possible solution.
I don't quite like to force the setup.py
to find files outside its directory. You'll need to hack it and create a non very elegant solution.
I can see two more options here:
-
There no special reason why Notebooks should be released under pypi. They could be downloaded from
stripy
once it's already installed. For example, you could create a function that downloads the notebooks from your repo:import stripy stripy.download_notebooks()
On @fatiando we have a very nice tool for fetching files from the web called pooch. It's very friendly when the files are located on a GitHub repo: you could specify what version of the file you want.
-
An easier solution would be to refactor the organization of the repo. Instead of having all the source code related files inside
stripy-src
, you could move all those files to the root. So the repo would look something like this:├── binder ├── Docker ├── litho1pt0-src ├── Notebooks ├── paper ├── stripy │ ├── f77-src │ ├── src │ ├── __init__.py │ ├── cartesian_meshes.py │ ├── cartesian.py │ ├── documentation.py │ ├── spherical_meshes.py │ └── spherical.py ├── tests ├── COPYING ├── COPYING.LESSER ├── LICENCE ├── MANIFEST.in ├── README.md └── setup.py
With this organization, the
Notebooks
directory is located on the same level assetup.py
, so you could easily add them aspackage_data
to pack the notebooks on pypi.
Moreover, with this organization you would enable the installation of the latest version of the repo with pip (as I mentioned on #17):pip install https://github.com/underworldcode/stripy/archive/master.zip
Both solutions are not mutually excluding, so you could apply both simultaneously if you consider it convenient.
Let me know if you like any of them or if you could think a third possible scenario.
As usual, I'm willing to help you implementing any solution you decide to apply.
I personally prefer your second proposal. I think the original motivation for keeping everything tucked away in stripy-src
was to de-clutter the root directory, but in hindsight that has created unforeseen issues where license files have not been detected and symbolic links to README.md were required. If we are moving litho1pt0
source to another repository (#26) I see little need to store stripy source code in a subdirectory. @lmoresi what do you think?
Great! Because this refactor could involve major changes, I would recommend creating a PR to do it.
That way we can test if everything is working as expected before merging it.
I don’t think litho1pt0 deserves a publication as it is really just a “thin wrapper” which is discouraged by joss. We can give it a zenodo presence, though.
Ok. I didn't know litho1pt0
is not suited for a JOSS publication. But would be nice if it has its own repo along with a Zenodo doi.
Alright, I have created a restructure branch to deal with the moving everything to the top-level directory with this commit c735c0a. Changes include:
/stripy-src
->/
/stripy-src/stripy/Notebooks
->/Notebooks
@santisoler: how do you include the Notebooks as package_data
? I have the following in setup.py
:
package_data = {'stripy': ['Notebooks/CartesianTriangulations/*ipynb',
'Notebooks/SphericalTriangulations/*ipynb',
'Notebooks/Data/*'] },
But they are nowhere to be found under stripy package when it is built with pip
or distutils
. Could you shed some light on why this is happening?
Many thanks!
Yes it's in there - but why would you define package_data
if you didn't want it to be included!? Weird option.
Alright, I have created a restructure branch to deal with the moving everything to the top-level directory
Great @brmather! Would you mind opening a Pull Request for the restructure
branch so we could continue the discussion over there? The Pull Requests interface offers a lot of tools for discussing over the proposed code, like reviews, comments or even changes suggestion from any collaborator.
(In case you don't know how to do it, you could get very nice instructions from this GitHub article)
Instead of waiting for that PR, I'll try to answer here so I don't delay you.
I'm seeing that setup.py
import two setup
functions:
from setuptools import setup, find_packages
from numpy.distutils.core import setup, Extension
The script will ultimately use only the last one of them, so would be better to remove the first import.
I'm not used to numpy.distutils.core
, so I'm not so familiar with this specific setup.py
.
Is there a reason why you should import setuptools
as well?
According to this StackOverflow answer, if you want to use setuptools
you need to import it as a whole before importing the numpy.distutils.core
related functions and classes:
import setuptools
from numpy.distutils.core import setup, Extension
I'm sure you have more knowledge and experience than me on this subject. Please enlighten me.
@santisoler: how do you include the Notebooks as package_data?
But they are nowhere to be found under stripy package when it is built with pip or distutils. Could you shed some light on why this is happening?
Ok. Maybe my idea of leaving the Notebooks
outside the stripy
directory is not a good one because the files included on package_data
must be inside the package directory.
This is mainly due to the fact that we want the Notebooks to be distributed inside the pypi package.
So I think the best choice would be to keep the Notebooks
inside stripy
.
Nevertheless, I think we could still apply some of the changes of restructure
branch.
Maybe, we can update the repo structure to look like this:
├── binder
├── Docker
├── litho1pt0-src
├── paper
├── stripy
│ ├── f77-src
│ ├── Notebooks
│ ├── src
│ ├── __init__.py
│ ├── cartesian_meshes.py
│ ├── cartesian.py
│ ├── documentation.py
│ ├── spherical_meshes.py
│ └── spherical.py
├── tests
├── COPYING
├── COPYING.LESSER
├── LICENCE
├── MANIFEST.in
├── README.md
└── setup.py
Regarding @lmoresi comment about include_package_data = True
, the setuptools.py
documentation specify three ways for including data files:
include_package_data
Accept all data files and directories matched by MANIFEST.in.package_data
Specify additional patterns to match files that may or may not be matched by MANIFEST.in or found in source control.exclude_package_data
Specify patterns for data files and directories that should not be included when a package is installed, even if they would otherwise have been included due to the use of the preceding options.
So, I think if you choose to use the include_package_data
, the package_data
could be omitted. The same goes otherwise: if using package_data
, the include_package_data
could be omitted.
In conclusion, if you use package_data = {"..." = ["..."]}
you don't need to specify include_package_data
.
I'm familiar with setuptools
and the second option (using package_data
), but I'm not entirely sure if numpy.distutils.core
is really using setuptools
or just distutils
.
OK I moved the Notebooks back inside /stripy
again with b1a135d.
We need numpy.distutils.core
to compile fortran sources. As far as I'm aware setuptools
is not capable of doing that.
In your proposed directory restructure, you have kept litho1pt0-src
in there. I was under the impression this should be moved to a separate repository under the underworldcode group. Is this still the idea? I've removed it and all associated notebooks in this pull request #31
I think we do move it. Some folks like to have related projects together, some do not. However, this one was always an uncomfortable marriage and I prefer the unencumbered stripy to be honest.
In placing the Notebooks within stripy, I followed a pattern that has been advocated elsewhere for distributing documentation in this form. As we have seen from trying alternatives, no choice is perfect but this one does work and it does have some precedent.
When we get to that point, it seems to be a matter of taste and we can defend that with readers of JOSS and users of the code.
OK I moved the Notebooks back inside
/stripy
again with b1a135d.
👍
We need
numpy.distutils.core
to compile fortran sources. As far as I'm awaresetuptools
is not capable of doing that.
Yes, you're right. But, from what I've read, I understood that importing setuptools
would enhance the setup
function from numpy.distutils.core
. My main concern about this was not functionality, but the presence of two setup
functions on the same script, what could cause problems in the future if you don't remember exactly which one you must use.
In your proposed directory restructure, you have kept
litho1pt0-src
in there. I was under the impression this should be moved to a separate repository under the underworldcode group. Is this still the idea? I've removed it and all associated notebooks in this pull request #31
Yes. Sorry for that, maybe I should have remove it from the structure.
My plan was to solve the notebooks location on this issue, leaving the presence of litho1pt0
for #26.
That's why I left the litho1pt0
on the proposed structure, just to make the two issues independent of each other.
In placing the Notebooks within stripy, I followed a pattern that has been advocated elsewhere for distributing documentation in this form. As we have seen from trying alternatives, no choice is perfect but this one does work and it does have some precedent.
When we get to that point, it seems to be a matter of taste and we can defend that with readers of JOSS and users of the code.
I agree with that solution. Sometimes we open an issue that won't ultimately solve any problem, but make us rethink why we have designed some piece of software as it is and why it's the best solution we can find.
I think this kind of discussion make software more robust.