sagemath/sage

Interfaces: use more lazy imports, restore top-level functions maxima_console etc.

Closed this issue · 68 comments

Various files in sage/interfaces/ have this pattern:

class Octave(...):
    ....

octave = Octave()

When combined with from .octave import Octave, octave in sage/interfaces/all.py, this means that an instance of Octave is created when Sage starts up. I think we should avoid this with lazy imports.

Depends on #16522

Component: interfaces

Author: John Palmieri

Branch: 09f5d92

Reviewer: Matthias Koeppe

Issue created by migration from https://trac.sagemath.org/ticket/34547

comment:1

+1

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

b483758trac 34547: use lazy imports in much of sage.interfaces.all
comment:4

Here is a first attempt. I'll mark it as ready for review, but I am certainly open to changes. If I made changes for the imports for magma and polymake, I got problems with pickling, so I have left those as they were before.

comment:5

Regarding the change

-interfaces = ['gap', 'gap3', 'giac', 'gp', 'mathematica', 'gnuplot', \
-              'kash', 'magma', 'macaulay2', 'maple', 'maxima', \
-              'mathematica', 'mwrank', 'octave', 'r', \
-              'singular', 'sage0', 'sage']

Do we need to deprecate this before just removing it? I don't know what purpose it was supposed to serve, except perhaps it is the remnants of some old code that once looped through it.

Changed commit from b483758 to 250da03

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

250da03trac 34547: use lazy imports in much of sage.interfaces.all

Description changed:

--- 
+++ 
@@ -6,5 +6,5 @@
 
 octave = Octave()
 ```
-When combined with `from .octave import Octave, octave` in sage/interfaces/all.py`, this means that an instance of `Octave` is created when Sage starts up. I think we should avoid this with lazy imports.
+When combined with `from .octave import Octave, octave` in `sage/interfaces/all.py`, this means that an instance of `Octave` is created when Sage starts up. I think we should avoid this with lazy imports.
 
comment:8

Where do the ".all" namespaces actually get used and for what? Are they just to trigger initialization of modules? Or do people use them to access symbols too? They certainly be importing symbols from them. Especially with lazy imports, that leads to undesirable side-effects (see e.g. #16522):

sage: type(maxima_calculus)
<class 'sage.misc.lazy_import.LazyImport'>
sage: type(sage.calculus.calculus.maxima)
<class 'sage.misc.lazy_import.LazyImport'>
sage: sage.calculus.calculus.maxima is maxima_calculus  # they are the same object
True
sage: maxima_calculus(1)
1
sage: type(maxima_calculus)
<class 'sage.misc.lazy_import.LazyImport'>
sage: type(sage.calculus.calculus.maxima)
<class 'sage.interfaces.maxima_lib.MaximaLib'>

So as you can see, we end up with a LazyImport object bound in two places. Calling it removes one of the shims but not the other (it can't! it doesn't have a reference to the relevant namespace!).
This object rebinding is only tried upon the first resolution of the lazy object, but it does mean that the shim in the wrong location stays in place (and these shims are not perfect replacements for their objects)

The delayed instantiation of interfaces is laudable but, as you can see, can introduce new problems. What does work is lazy importing the object where you need it, but I'm not so sure we need these objects at the 'all' level. Certainly the from <...>.all import * done in sage.all propagates this problem. The tool clean_namespace at the branch at #16522 can clean it up after import ...

comment:9

Replying to Nils Bruin:

Where do the ".all" namespaces actually get used and for what? Are they just to trigger initialization of modules?

sage.interfaces is a namespace package, so we're getting rid of imports from sage.interfaces.all within the Sage library, see #34201. When this is done, all that will remain is the reimport into sage.all, to fill the interactive top level as the only purpose.

comment:10

Right now, git grep "from.*interfaces.all.*import" src/sage only reveals the import in sage.all. There are a few matches for git grep "import.*interfaces.all" src/sage.

comment:11

Maybe we need to keep the one in src/sage/repl/interface_magic.py?

comment:12

Replying to Matthias Köppe:

Replying to Nils Bruin:

Where do the ".all" namespaces actually get used and for what? Are they just to trigger initialization of modules?

sage.interfaces is a namespace package, so we're getting rid of imports from sage.interfaces.all within the Sage library, see #34201. When this is done, all that will remain is the reimport into sage.all, to fill the interactive top level as the only purpose.

Ah, OK, so our current infrastructure actually guarantees that LazyImports in sage.all end up broken in the toplevel. Possible fixes would be to build the global namespace piecemeal and recreate appropriate LazyImport objects rather than just referencing them, or just do the copy and then clean them up with something like clean_namespace (see #16522). The latter is much easier to implement. It looks like cleaning sage.all and the toplevel namespace upon construction would give us a lot of benefit for relatively low effort.

One reason to ensure LazyImport shims don't stay lying around, especially at top level, is that documentation is only partially transparently reported. Compare:

sage: sage.calculus.calculus.maxima? # reports doc but with wrong type, signature, and source file
sage: sage.calculus.calculus.maxima? # second time is OK

versus

sage: maxima_calculus? # never gets type, signature, and source file right

Note:

sage: maxima_calculus?? # finds source but is still missing type and signature
comment:13

Replying to Nils Bruin:

so our current infrastructure actually guarantees that LazyImports in sage.all end up broken in the toplevel. Possible fixes would be to [...] or just do the copy and then clean them up with something like clean_namespace (see #16522). The latter is much easier to implement. It looks like cleaning sage.all and the toplevel namespace upon construction would give us a lot of benefit for relatively low effort.

Thanks for the pointer to #16522. I haven't looked at the details, but this sounds promising.

Branch pushed to git repo; I updated commit sha1. New commits:

9566546trac 34547: remove two "import sage.interfaces.all" lines

Changed commit from 250da03 to 9566546

comment:15

Is this branch okay, or do we need to do something to accommodate #16522?

comment:16

Are these changes:

--- a/src/sage/interfaces/axiom.py
+++ b/src/sage/interfaces/axiom.py
@@ -490,7 +490,7 @@ class Axiom(PanAxiom):
         """
         EXAMPLES::
 
-            sage: axiom.__reduce__()
+            sage: Axiom().__reduce__()
             (<function reduce_load_Axiom at 0x...>, ())

needed because of #16522?

comment:17

I see the following with this branch:

sage: axiom.__reduce__
<built-in method __reduce__ of sage.misc.lazy_import.LazyImport object at 0x19a6991c0>
sage: Axiom().__reduce__
<bound method Axiom.__reduce__ of Axiom>
sage: axiom.__reduce__()
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In [7], line 1
----> 1 axiom.__reduce__()

File /usr/local/Cellar/python@3.10/3.10.8/Frameworks/Python.framework/Versions/3.10/lib/python3.10/copyreg.py:76, in _reduce_ex(self, proto)
     74 else:
     75     if base is cls:
---> 76         raise TypeError(f"cannot pickle {cls.__name__!r} object")
     77     state = base(self)
     78 args = (cls, base, state)

TypeError: cannot pickle 'LazyImport' object
sage: Axiom().__reduce__()
(<function reduce_load_Axiom at 0x19d557c70>, ())

I don't understand why this change is necessary.

comment:18

Maybe it is #16522.

comment:19

Yes, I think this should be solved via #16522 instead of working around it in the doctests

Dependencies: #16522

comment:20

I don't think #16522 will prevent this problem: I don't think special methods like __reduce__, which are present of LazyImport objects, trigger resolution. According to https://docs.python.org/3/reference/datamodel.html#object.getattr, the __getattr__ method, which is the one LazyImport uses, only gets invoked when normal attribute lookup fails.

This is why most slot methods are separately wrapped to trigger resolution, because these would never fail.

comment:21

The __reduce__ doctests look pretty artificial. Would there be more meaningful ones that we could use as replacements?

comment:22

By the way, an alternate approach to this ticket would be to remove the top-level imports of some of these packages altogether: make users explicitly import them when needed. Potentially affected packages, all from src/interfaces:

  • axiom
  • ecm
  • 4ti2
  • fricas
  • frobby
  • gap3
  • genus2reduction
  • gfan
  • giac (pexpect interface)
  • gnuplot
  • kash
  • lie
  • lisp
  • macaulay2
  • magma (not currently changed by this ticket)
  • magma_free
  • maple
  • mathematica
  • mathics
  • matlab
  • mupad
  • mwrank
  • octave
  • polymake (not currently changed by this ticket)
  • povray
  • psage
  • qepcad
  • qsieve
  • r
  • read_data
  • scilab
  • tachyon
    These do not have to be treated all the same, of course.
comment:23

Replying to Nils Bruin:

I don't think #16522 will prevent this problem

You are right.

Reviewer: Matthias Koeppe

comment:24

LGTM.

comment:25

But maybe we should also take care of these ones in the same ticket:

        from .axiom import axiom_console
        from .fricas import fricas_console
        from .gap import gap_console
...
comment:26

Can you explain when those imports are triggered? I see this with vanilla Sage:

sage: from sage.repl.rich_output.display_manager import get_display_manager as _get_display_manager
sage: _get_display_manager().is_in_terminal()
True
sage: axiom_console()
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In [5], line 1
----> 1 axiom_console()

NameError: name 'axiom_console' is not defined
sage: gap_console()
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In [7], line 1
----> 1 gap_console()

NameError: name 'gap_console' is not defined
comment:27

They are supposed to be defined in terminal sessions. But they are missing already in 9.8.beta5

comment:28

At the time of importing the module, _get_display_manager() returns "The Sage display manager using the simple backend simple".

comment:29

Not sure what change broke this; perhaps a recent upgrade ipython.

I think we can just unconditionally lazy_import these functions

Branch pushed to git repo; I updated commit sha1 and set ticket back to needs_review. This was a forced push. New commits:

5f08ad8trac 34547: use lazy imports in much of sage.interfaces.all
5cdfa88trac 34547: remove two "import sage.interfaces.all" lines
4f0dc51trac 34547: more lazy imports in interfaces/all.py

Changed commit from 9566546 to 4f0dc51

comment:31

Here is a branch that lazily imports all of these. The other option would be just to delete them.

comment:32

This is still conditionalizing these imports. I think we should just remove the test if _get_display_manager().is_in_terminal() and always lazy_import these bindings

Changed commit from 4f0dc51 to c1b3670

Branch pushed to git repo; I updated commit sha1. New commits:

c1b3670trac 34547: unconditionally lazily import *_console
comment:34

Let's try this.

comment:36

Thank you. The right place for these *_console imports is perhaps in sage.all_cmdline, but I can't get that to work.

comment:37

No, it works. Should I make that change?

comment:38

(Commands like gap_console do not work in the Jupyter notebook.)

comment:39

You may recall that lazy_import of non-function objects can lead to very unexpected outcomes. It may be the case that with these _console objects this is less problematic, because people will likely not be interested in its id or its hash (or its identity in general), but I think you need to have a good reason to use it anyway. Would much be lost if these _console objects were not available from interfaces?

By the way, note that the creation of the Octave() instance does not involve creating the process. That only happens once there's actually interaction with it. So it may well be that the cost of creating these console objects is actually pretty low. There may still be an argument to not create them by default, but I suspect the real reason for that is closer to a reason for not having these default instances in the first place.

comment:40

Maybe I’m misunderstanding, but aren’t these console objects actually functions? So it should be okay to lazily import them. (I didn’t check all of them, but all of the ones I checked are functions.)

comment:41

Apologies, indeed, the "_console" objects are just functions that, in the terminal, just starts a fresh process; as far as I can see simply as a "subshell": it just executes "os.system(cmd)". I think it's very rare current usage of sage would use that and it's easily replicated by other means, so I'd expect the "_console" routines can just go.

In fact, as it stands, they are not available in my global namespace! So somehow:

try:
    from sage.repl.rich_output.display_manager import get_display_manager as _get_display_manager
except ImportError:
    pass
else:
    if _get_display_manager().is_in_terminal():
        [BLOCK]

must have led to not executing [BLOCK], even though on the IPython command line is_in_terminal does return true and the import succeeds as well. Perhaps during initialization one of these things works differently?

My concern was for the interface objects themselves (the "magma" and "octave" instances): those are not just functions. They're actually instances of a quite complicated class.
They're obviously written to be quite light-weight (only initializing the expect interface if they're called) so you may want to do some timings to see if making them lazy actually saves anything measurable.

comment:42

Replying to John Palmieri:

The right place for these *_console imports is perhaps in sage.all_cmdline

I think that's a good solution.

I don't think we need to be concerned about use of these functions in user code that would try to import them from sage.interfaces.all.

Changed commit from c1b3670 to 70b7c44

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

70b7c44trac 34547: unconditionally lazily import *_console and move the
comment:44

I have not been able to do any meaningful timings. I tried running sage --startuptime multiple times, and at first I saw faster times with the changes on this ticket, but as I repeated the timings, the differences seemed to go away (and sage --startuptime reported slightly longer times the more I ran it).

comment:45

I also wouldn't object to just removing the top-level imports for the *_console functions. I don't remember seeing any complaints about their being broken.

comment:47

Build & Test shows failures:

File "sage/misc/session.pyx", line 12, in sage.misc.session
Failed example:
    show_identifiers()
Expected:
    ['w']
Got:
    ['axiom_console',
     'fricas_console',
     'gap3_console',
     'gap_console',
     'giac_console',
     'gnuplot_console',
     'gp_console',
     'kash_console',
     'lie_console',
     'macaulay2_console',
     'magma_console',
     'maple_console',
     'mathematica_console',
     'mathics_console',
     'matlab_console',
     'maxima_console',
     'mupad_console',
     'mwrank_console',
     'octave_console',
     'pkg',
     'qepcad_console',
     'r_console',
     'sage0_console',
     'singular_console',
     'w']
comment:48

One of the failures can be avoided by deleting the loop variable pkg at the end. For the others, should we change the doctests?

comment:49

Oh, I see, we can just move sage.misc.session.init() after the lazy imports.

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

09f5d92trac 34547: unconditionally lazily import *_console and move the

Changed commit from 70b7c44 to 09f5d92

comment:51

This works for me.

Changed commit from 09f5d92 to none

comment:54

Followup at #34927: a change to a doctest for __reduce__ in fricas.py along the same lines as above. That doctest was marked as optional, which is why I didn't catch it before.

comment:55

More issues: I have no idea why, but removing these lines from interfaces.all break the emacs sage-shell-mode:

-interfaces = ['gap', 'gap3', 'giac', 'gp', 'mathematica', 'gnuplot', \
-              'kash', 'magma', 'macaulay2', 'maple', 'maxima', \
-              'mathematica', 'mwrank', 'octave', 'r', \
-              'singular', 'sage0', 'sage']

If I restore this, or if I just add a line interfaces = [], then it seems to work again. I am not going to open a ticket for this right now, in part because I'm busy with other things and in part because I have no idea what's going on.

I thought that this might be related to repl.interface_magic, since that's a place that imports all from sage.interfaces.all, but I can't see why. However, the code in that file tries to register the various interfaces to create IPython magic commands like %%gap, and with the various new lazy imports, those interfaces are no longer registered. The now missing magic commands:

%%axiom
%%fricas
%%gap3
%%giac
%%kash
%%lie
%%lisp
%%macaulay2
%%maple
%%mathematica
%%mathics
%%matlab
%%mupad
%%mwrank
%%octave
%%r

Do we care? Is it worth restoring these?

comment:56

This line seems to refer to this global Sage variable -- https://github.com/sagemath/sage-shell-mode/blob/master/emacs_sage_shell.py#L55

comment:57

Thank you for finding that. I do enjoy reading uncommented code...

comment:58

This variable is not used anywhere in the Sage library, as far as I can tell. I see three options:

  1. restore it just as it was before
  2. restore it and set it to a sensible value, probably asking the sage-shell people what that means
  3. let the sage-shell people set it in their own code to a sensible value

1 is easiest. 2 makes more sense than 1, and 3 makes more sense than 2, at least to me. Unless we think that other people were also using this variable?

comment:59

To spell out the easy solution:

diff --git a/src/sage/interfaces/all.py b/src/sage/interfaces/all.py
index 92e19e3d38..53cf7c92dd 100644
--- a/src/sage/interfaces/all.py
+++ b/src/sage/interfaces/all.py
@@ -43,3 +43,9 @@ lazy_import('sage.interfaces.r', ['r', 'R', 'r_version'])
 lazy_import('sage.interfaces.read_data', 'read_data')
 lazy_import('sage.interfaces.scilab', 'scilab')
 lazy_import('sage.interfaces.tachyon', 'tachyon_rt')
+
+# The following variable is used by sage-shell-mode in emacs:
+interfaces = ['gap', 'gap3', 'giac', 'gp', 'mathematica', 'gnuplot', \
+              'kash', 'magma', 'macaulay2', 'maple', 'maxima', \
+              'mathematica', 'mwrank', 'octave', 'r', \
+              'singular', 'sage0', 'sage']
comment:60

Yes, I think that's a good solution, at least for now.
We don't really have protocols for documenting and deprecating global variables, so we should probably accept that downstream code makes use of this variable.

comment:61

Done in #34935.