sagemath/sage

Eliminate use of __init__.py files for supplying package docstrings

mkoeppe opened this issue · 68 comments

... as in src/sage/categories/__init__.py or src/sage/combinat/__init__.py

(part of #32501)

CC: @tscrim @kwankyu @nthiery

Component: refactoring

Author: Kwankyu Lee

Branch: ee4367d

Reviewer: Matthias Koeppe

Issue created by migration from https://trac.sagemath.org/ticket/32508

Author: Kwankyu Lee

Branch pushed to git repo; I updated commit sha1. New commits:

5dcb3f5Remove docstrings from __init__.py
comment:4

Previously

sage: import sage.combinat
sage: sage.combinat.quickref?

worked. Not anymore with the patch. We should

sage: import sage.combinat.quickref
sage: sage.combinat.quickref?

The same with sage.combinat.tutorial

comment:5

More seriously, before the patch,

sage: import sage.combinat
sage: help(sage.combinat)

gives a nice ... help. Could we retain this functionality somehow while eliminating the content of sage.combinat.__init__.py?

comment:6

I guess with a bit of extra code in src/sage/misc/sagedoc.py, where help is defined, we could do this

comment:7

Replying to @mkoeppe:

I guess with a bit of extra code in src/sage/misc/sagedoc.py, where help is defined, we could do this

I cannot guess how. Where should the content go? Do you imagine a file like src/sage/combinat/doc.rst or something?

comment:8

src/sage/combinat/__doc__.py would be better name for that.

comment:9

+1 on __doc__.

comment:10

Replying to @mkoeppe:

+1 on __doc__.

Do you mean __doc__ without .py extension?

comment:11

With .py, like you suggested

comment:12

Then __doc__.py with a python docstring ("""...""")?

comment:13

On the other hand, I wonder if there is a solution(or recommendation) for this problem(package docstring) in python level as this seems a general issue with namespace packages.

comment:14

Replying to @kwankyu:

Then __doc__.py with a python docstring ("""...""")?

Yes, that's what I thought

comment:15

Replying to @kwankyu:

On the other hand, I wonder if there is a solution(or recommendation) for this problem(package docstring) in python level as this seems a general issue with namespace packages.

I agree, it would be good to find out whether there are existing practices for this in the Python community.

comment:16

Replying to @mkoeppe:

Replying to @kwankyu:

Then __doc__.py with a python docstring ("""...""")?

Yes, that's what I thought

One problem with __doc__.py is that it is a fake module for it cannot be imported.

How about this:

src/sage/combinat/__init__/__init__.py

Then the namespace package sage.combinat get to have a subpackage __init__ initialized by __init__.py, which we perhaps may use for "some initialization" as well as for package docstring.

Is this too much or incompatible with the modularization plan?

comment:17

Replying to @mkoeppe:

I agree, it would be good to find out whether there are existing practices for this in the Python community.

I googled for it, but found none.

comment:18

How about we just use all.py for this purpose?

(Modularized packages such as sagemath-polyhedra (#32432) add files such as all__sagemath_polyhedra.py)

comment:19

Replying to @mkoeppe:

How about we just use all.py for this purpose?

(Modularized packages such as sagemath-polyhedra (#32432) add files such as all__sagemath_polyhedra.py)

How does all.py work?

(1) Does it exist by itself and import other all__xxx files into it?

(2) Or is it dynamically created by combining all__xxx files?

Anyway, if it is at src/sage/combinat/all.py, then it cannot be imported. Can it be?

comment:20

Yes, all.py exists as a file in Sage. For namespace packages, I am modularizing it by splitting parts of the existing all.py out to all__sagemath_objects.py, all__sagemath_categories.py, all__sagemath_polyhedra etc. When a distribution has another distribution as a dependency, then it imports the dependency's all__... module. For example, sagemath-polyhedra has sagemath-categories as a dependency, so you see the imports: https://github.com/sagemath/sagetrac-mirror/blob/a58ea1f6decbb84102bd341f8b3cc20f3147452b/src/sage/all__sagemath_polyhedra.py&id2=f716a0b366e31bbb546230140489244cfb68390d and https://github.com/sagemath/sagetrac-mirror/blob/a58ea1f6decbb84102bd341f8b3cc20f3147452b/src/sage/rings/all__sagemath_polyhedra.py&id2=f716a0b366e31bbb546230140489244cfb68390d

comment:21

Generally, if the all module is present, then it can also be imported.
Likewise, if an all__... module is present, then it can also be imported.

comment:22

If sage.misc.sagedoc.help is used on a package P without __doc__ attribute, it could try to import everything in dir(P) whose name begins with all and accumulate the docstrings of the importable modules.

comment:23

Replying to @mkoeppe:

Generally, if the all module is present, then it can also be imported.
Likewise, if an all__... module is present, then it can also be imported.

Hmm. if the all module is just below in a namespace package (without __init__.py), I cannot import it. Do I misunderstand something?

For experiment, I created src/sage/namespace/all.py but no src/sage/namespace/__init__.py. I get

sage: import sage.namespace.all
---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
<ipython-input-1-f30e99bc2b64> in <module>
----> 1 import sage.namespace.all

ModuleNotFoundError: No module named 'sage.namespace.all'
comment:24

The current default build system of the Sage library does not actually understand namespace packages. If you look into the installed Sage library in site-packages, you will see that your file all.py has not been copied there. This will be fixed in #28925.

If you want to test this, you can use ./configure --enable-editable to switch to a different build system (https://wiki.sagemath.org/ReleaseTours/sage-9.3#Editable_.28.22in-place.22.2C_.22develop.22.29_installs_of_the_Sage_library)

comment:25

Replying to @mkoeppe:

The current default build system of the Sage library does not actually understand namespace packages. If you look into the installed Sage library in site-packages, you will see that your file all.py has not been copied there. This will be fixed in #28925.

If you want to test this, you can use ./configure --enable-editable to switch to a different build system (https://wiki.sagemath.org/ReleaseTours/sage-9.3#Editable_.28.22in-place.22.2C_.22develop.22.29_installs_of_the_Sage_library)

Thanks. It works.

comment:26

Replying to @mkoeppe:

How about we just use all.py for this purpose?

I disagree. all.py serves its own purpose and may have its own docstring.

On the other hand, __doc__.py is somewhat misleading as it appears "official" as much as __init__.py, which is not true.

Now I suggest doc.py to store package docstrings. This is just a sage thing.

comment:27

For the use of doc.py:

It seems that the package docstrings in __init__.py files are mostly intended for our Sphinx documentation, not for using with help(...).

So in this ticket, I would focus only on Sphinx documentation and ignore possible use with help(). If really needed, the use with help() may be dealt with in other ticket in future.

comment:28

Replying to @kwankyu:

Replying to @mkoeppe:

How about we just use all.py for this purpose?

I disagree. all.py serves its own purpose and may have its own docstring.

Do you see an example of an all module that has a useful docstring?

comment:29

Replying to @mkoeppe:

Replying to @kwankyu:

Replying to @mkoeppe:

How about we just use all.py for this purpose?

I disagree. all.py serves its own purpose and may have its own docstring.

Do you see an example of an all module that has a useful docstring?

I didn't examine all, but some all modules have docstrings. Useful or not, I don't know.

I think we overload all.py if we add package docstring to it...

comment:30

I advocate doc.py since

sage.combinat.doc

reads better than

sage.combinat.all

in documentation.

comment:31

doc is fine with me

comment:32

Working on this, I now realize that what this ticket aims at goes against what the ticket #16256 achieved, which systematically used __init__.py for package docstrings. So whatever we do here has possibility to make the combinat people unhappy.

For example,
sage.combinat.crystals? will not work. Instead

sage: import sage.combinat.crystals.doc
sage: sage.combinat.crystals.doc?

would work.

Travis, what do you think?

comment:33

While I don't use the feature sage.combinat?, I can certainly see it as being useful. If we can get this to continue working, then I would have no objections.

I am cc-ing Nicolas to see if he has any comments.

comment:34

Replying to @tscrim:

While I don't use the feature sage.combinat?, I can certainly see it as being useful. If we can get this to continue working, then I would have no objections.

It seems that we cannot get that to continue working unless we patch IPython, which perhaps we don't want to.

comment:35

Thanks for working on Sage's modularization and for investigating for
solutions!

I see the technical hurdles behind, and don't have a good solution to offer,
but I advocate for finding some technical mean to keep the functionality package? to retrieve
information about the package. This is consistent with how users retrieve
information by introspection for all other types of objects (functions, ...),
and when it's not there I have witnessed countless users during classes or
workshops struggle to find an entry point for the documentation (having
to search for the proper submodule; or leave the interpreter to search
around from the ref manual or browsing the web).

This is indeed not a widely adopted idiom in the Python world, yet this is so
natural that I believe we should push for it.

Thanks in advance,

comment:36

Thanks for all comments. I agree that thinking of the ? functionality is important.

I think an easy solution is to have all.py put a __doc__ attribute into its containing package.

Changed commit from 5dcb3f5 to d7085db

Changed branch from u/klee/32508 to public/32508

New commits:

d7085dbMove package docstrings out of __init__.py
comment:38

Replying to @mkoeppe:

Thanks for all comments. I agree that thinking of the ? functionality is important.

I think an easy solution is to have all.py put a __doc__ attribute into its containing package.

That works. Thanks!

comment:39

Replying to @mkoeppe:

I disagree. all.py serves its own purpose and may have its own docstring.

Do you see an example of an all module that has a useful docstring?

I examined all. No I don't. Using all now seems a simpler solution than introducing a new file doc.py. Even when it has a significant docstring, we can still use all to have the package docstring along with it.

comment:40

This looks great to me. Ready for review?

comment:41

Great to have a simple solution! all.py is not super standard in the
Python world, but since it's a long time idiom to use it in Sage,
we might indeed as well use it for the package docstrings too.
Also, since <package>.__doc__ is only used at runtime and not during
package loading; so having a little dynamic monkey patching of the package
causes no risk.

Hence +1 on my side on the principle. Thanks!

comment:42

Replying to @mkoeppe:

This looks great to me. Ready for review?

We are supposed to deal with all __init__ files. No?

Now that we have a solution, I think we need to polish the code, especially as it would be used as an idiom globally in Sage.

Any ideas for improvement, in names or organization, are welcome.

comment:43

Replying to @nthiery:

Great to have a simple solution! all.py is not super standard in the
Python world, but since it's a long time idiom to use it in Sage,
we might indeed as well use it for the package docstrings too.

There seems no better place for (namespace) package docstrings. But I am open for better ideas.

comment:44

I don't see a reason why all.py should have its own docstring either. I am +1 for using that.

comment:45

Replying to @kwankyu:

Replying to @mkoeppe:

This looks great to me. Ready for review?

We are supposed to deal with all __init__ files. No?

Not necessarily. Only the __init__.py files of packages that will become namespace packages. See explanation in #32501 (also related to our previous discussion in #32734 comment:13).

comment:46

Re-running the command from the ticket description of #32501 on the present branch:

$ find src/sage -name '__init__.py' | xargs wc -l | grep -v '^ *0'
       2 src/sage/crypto/__init__.py
       3 src/sage/crypto/mq/__init__.py
       3 src/sage/dynamics/__init__.py
       3 src/sage/dynamics/cellular_automata/__init__.py
       1 src/sage/combinat/words/__init__.py
       1 src/sage/combinat/designs/__init__.py
       1 src/sage/combinat/chas/__init__.py
       1 src/sage/combinat/path_tableaux/__init__.py
       1 src/sage/combinat/species/__init__.py
       1 src/sage/combinat/sf/__init__.py
       1 src/sage/combinat/ncsf_qsym/__init__.py
       6 src/sage/combinat/integer_lists/__init__.py
       9 src/sage/combinat/root_system/__init__.py
       3 src/sage/doctest/__init__.py
     847 src/sage/features/__init__.py
      15 src/sage/repl/__init__.py
       4 src/sage/repl/rich_output/__init__.py
      34 src/sage/__init__.py
     358 src/sage/libs/giac/__init__.py
       2 src/sage/libs/ntl/__init__.py
       1 src/sage/libs/gap/__init__.py
     199 src/sage/libs/pari/__init__.py
      55 src/sage/cpython/__init__.py
       2 src/sage/sat/converters/__init__.py
       4 src/sage/sat/solvers/__init__.py
       1 src/sage/modular/quasimodform/__init__.py
       2 src/sage/finance/__init__.py
       2 src/sage/matroids/__init__.py
     114 src/sage/rings/polynomial/pbori/__init__.py
      10 src/sage/modules/with_basis/__init__.py
       2 src/sage/structure/__init__.py
       2 src/sage/media/__init__.py
comment:47

To do:

  • src/sage/dynamics/__init__.py - because sage.dynamics.complex_dynamics needs sage.symbolic and everything else does not, see #32601
  • src/sage/combinat/root_system/__init__.py - because git grep 'cimport' src/sage/combinat/root_system/ reveals from sage.groups.perm_gps.permgroup_element cimport PermutationGroupElement, and git grep 'cimport' src/sage/groups/perm_gps shows dependencies on numerous libraries: gap, flint
  • src/sage/matroids - to play safe given that there's a lot of cimporting

Not necessary to change:

  • src/sage/combinat/words/__init__.py because git grep 'cimport' src/sage/combinat/words does not show compile-time dependencies on libraries, so sage.combinat.words does not have to become a namespace package
  • likewise src/sage/combinat/integer_lists/__init__.py
  • src/sage/sat/... because git grep 'cimport' src/sage/sat/ is empty
  • src/sage/features/, src/sage/libs/..., src/sage/repl because they will definitely not become namespace packages

Changed commit from d7085db to b7d2be9

Branch pushed to git repo; I updated commit sha1. New commits:

b7d2be9More refactoring of `__init__` files

New commits:

b7d2be9More refactoring of `__init__` files
comment:50

Re-running the command from the ticket description of #32501 on the present branch:

$ find src/sage -name '__init__.py' | xargs wc -l | grep -v '^ *0'
       2 src/sage/crypto/__init__.py
       3 src/sage/crypto/mq/__init__.py
       3 src/sage/dynamics/cellular_automata/__init__.py
      41 src/sage/combinat/words/__init__.py
       1 src/sage/combinat/designs/__init__.py
       1 src/sage/combinat/chas/__init__.py
       1 src/sage/combinat/path_tableaux/__init__.py
       1 src/sage/combinat/species/__init__.py
       1 src/sage/combinat/sf/__init__.py
       1 src/sage/combinat/ncsf_qsym/__init__.py
       6 src/sage/combinat/integer_lists/__init__.py
       9 src/sage/combinat/root_system/__init__.py
       3 src/sage/doctest/__init__.py
     847 src/sage/features/__init__.py
      15 src/sage/repl/__init__.py
       4 src/sage/repl/rich_output/__init__.py
      34 src/sage/__init__.py
     358 src/sage/libs/giac/__init__.py
       2 src/sage/libs/ntl/__init__.py
       1 src/sage/libs/gap/__init__.py
     199 src/sage/libs/pari/__init__.py
      55 src/sage/cpython/__init__.py
       2 src/sage/sat/converters/__init__.py
       4 src/sage/sat/solvers/__init__.py
       1 src/sage/modular/quasimodform/__init__.py
       2 src/sage/finance/__init__.py
       2 src/sage/matroids/__init__.py
     114 src/sage/rings/polynomial/pbori/__init__.py
      10 src/sage/modules/with_basis/__init__.py
       2 src/sage/structure/__init__.py
       2 src/sage/media/__init__.py
    1727 total

Reviewer: Matthias Koeppe

comment:52

Looking great but I'm getting a docbuild error:

[dochtml] [combinat ] /Users/mkoeppe/s/sage/sage-rebasing/local/lib/python3.9/site-packages/sage/combinat/crystals/crystals.py:docstring of sage.combinat.crystals.crystals:122: WARNING: undefined label: sage.combinat.crystals
[dochtml] [combinat ] /Users/mkoeppe/s/sage/sage-rebasing/local/lib/python3.9/site-packages/sage/combinat/quickref.py:docstring of sage.combinat.quickref:48: WARNING: undefined label: sage.combinat.root_system
[dochtml] [combinat ] /Users/mkoeppe/s/sage/sage-rebasing/local/lib/python3.9/site-packages/sage/combinat/quickref.py:docstring of sage.combinat.quickref:54: WARNING: undefined label: sage.combinat.crystals
[dochtml] [combinat ] /Users/mkoeppe/s/sage/sage-rebasing/local/lib/python3.9/site-packages/sage/combinat/quickref.py:docstring of sage.combinat.quickref:71: WARNING: undefined label: sage.combinat.posets
[dochtml] [combinat ] /Users/mkoeppe/s/sage/sage-rebasing/local/lib/python3.9/site-packages/sage/combinat/root_system/root_system.py:docstring of sage.combinat.root_system.root_system:3: WARNING: undefined label: sage.combinat.root_system
[dochtml] [combinat ] dumping search index in English (code: en)... done
[dochtml] [combinat ] The HTML pages are in local/share/doc/sage/html/en/reference/combinat.
[dochtml] Error building the documentation.

Branch pushed to git repo; I updated commit sha1. New commits:

ee4367dFixes for doctest failures

Changed commit from b7d2be9 to ee4367d

comment:54

Strange that those errors got slipped in unnoticed...

comment:55

It's working now, and the built documentation also looks fine. Thanks!

comment:56

Looks good indeed.

Just one thing (that could possibly go in a follow up
ticket): would there be a way to maintain the validity
of cross-references like :ref:sage.combinat ?

First because it's easier to remember, and does not
reveal an implementation detail (that the doc is actually
stored in sage/combinat/all.py).

But also because, thanks to intersphinx links these
cross-references can be used from documents outside
of the Sage documentation, and this is breaking
backward compatibility for them.

Thanks,

comment:57

Replying to @nthiery:

Looks good indeed.

Just one thing (that could possibly go in a follow up
ticket): would there be a way to maintain the validity
of cross-references like :ref:sage.combinat ?

If you compare "before" and "after" below, we see that Sphinx recognizes regular package and treat it appropriately. Hence I think we need to wait until Sphinx gets equipped with some mechanism to deal with namespace packages appropriately. It seems that Sphinx does not have such a mechanism yet (as far as I know by googling).


Before the patch: this line

.. toctree::    

    sage/combinat/__init__   

generates

.. nodoctest

.. _sage.combinat:

Combinatorics
=============

.. This file has been autogenerated.


.. automodule:: sage.combinat.__init__
   :members:
   :undoc-members:
   :show-inheritance:   

After the patch: this line

.. toctree::    

    sage/combinat/all  

generates

.. nodoctest

.. _sage.combinat.all:

Combinatorics
=============

.. This file has been autogenerated.


.. automodule:: sage.combinat.all
   :members:
   :undoc-members:
   :show-inheritance:  

Changed branch from public/32508 to ee4367d

Changed commit from ee4367d to none

comment:59

The addition of

from . import quickref, tutorial

to sage.combinat.all broke the top-level tutorial() command. Maybe that command should be removed anyway, especially since no one has reported its loss, but in any case, right now tutorial() is broken as a command in Sage.

comment:60

Replying to @jhpalmieri:

The addition of

from . import quickref, tutorial

to sage.combinat.all broke the top-level tutorial() command. Maybe that command should be removed anyway, especially since no one has reported its loss, but in any case, right now tutorial() is broken as a command in Sage.

After 7 months, I don't remember why I put that line into all.py. Simply removing that line would solve the problem. Yes? We don't need quickref either in the global name space.

comment:61

Indeed, there is no reason for quickref and tutorial to be in the global namespace.
All we need is to be able to do sage.combinat? sage.combinat.tutorial? and sage.combinat.quickref?

comment:62

See #33933.