After #30053, sphinx 3.1.2 does not build on ubuntu-{trusty,xenial,bionic}, debian-jessie, centos-7 (again)
Closed this issue · 58 comments
sage-system-python is the Python that we use to download and unpack upstream archives. We allow a very wide range of Python versions.
After #30053 undid the change in #29033, we are again running into locale / encoding problems with early (pre-3.7) versions of Python used as sage-system-python.
This is a blocker ticket for Sage 9.2 because it is a severe regression in platform support compared to Sage 9.1.
Originally reported on this ticket:
/sage/build/bin/sage-spkg: line 73: warning: setlocale: LC_ALL: cannot change locale (C.UTF-8): No such file or directory
/sage/build/bin/sage-spkg: line 73: warning: setlocale: LC_ALL: cannot change locale (C.UTF-8)
Found local metadata for sphinx-3.0.4.p0
bash: warning: setlocale: LC_ALL: cannot change locale (C.UTF-8)
Attempting to download package Sphinx-3.0.4.tar.gz from mirrors
http://mirrors.xmission.com/sage/spkg/upstream/sphinx/Sphinx-3.0.4.tar.gz
[......................................................................]
sphinx-3.0.4.p0
====================================================
Setting up build directory for sphinx-3.0.4.p0
bash: warning: setlocale: LC_ALL: cannot change locale (C.UTF-8)
Traceback (most recent call last):
File "/sage/build/bin/sage-uncompress-spkg", line 23, in <module>
run()
File "/sage/build/bin/../sage_bootstrap/uncompress/cmdline.py", line 72, in run
unpack_archive(archive, dirname)
File "/sage/build/bin/../sage_bootstrap/uncompress/action.py", line 68, in unpack_archive
archive.extractall(members=archive.names)
File "/sage/build/bin/../sage_bootstrap/uncompress/tar_file.py", line 96, in extractall
**kwargs)
File "/usr/lib64/python3.6/tarfile.py", line 2010, in extractall
numeric_owner=numeric_owner)
File "/usr/lib64/python3.6/tarfile.py", line 2052, in extract
numeric_owner=numeric_owner)
File "/sage/build/bin/../sage_bootstrap/uncompress/tar_file.py", line 122, in _extract_member
**kwargs)
File "/usr/lib64/python3.6/tarfile.py", line 2122, in _extract_member
self.makefile(tarinfo, targetpath)
File "/usr/lib64/python3.6/tarfile.py", line 2163, in makefile
with bltn_open(targetpath, "wb") as target:
UnicodeEncodeError: 'ascii' codec can't encode character '\xe4' in position 45: ordinal not in range(128)
************************************************************************
Error: failed to extract /sage/upstream/Sphinx-3.0.4.tar.gz
************************************************************************
Please email sage-devel (http://groups.google.com/group/sage-devel)
explaining the problem and including the log file
/sage/logs/pkgs/sphinx-3.0.4.p0.log
Describe your computer, operating system, etc.
************************************************************************
See e.g. https://github.com/sagemath/sage/runs/811777719?check_suite_focus=true
But now (https://github.com/mkoeppe/sage/runs/1141520147) we see it for ubuntu-trusty-minimal (where sage-system-python is python 3.4).
We fix this problem by making locale fixes in the sage-system-python script. See e.g. https://github.com/mkoeppe/sage/runs/1142310012 for a run with the present ticket.
Depends on #30053
CC: @vbraun @dimpase @orlitzky @jhpalmieri @seblabbe
Component: packages: standard
Author: Matthias Koeppe
Branch/Commit: ff0dbc6
Reviewer: Jonathan Kliem, Sébastien Labbé
Issue created by migration from https://trac.sagemath.org/ticket/30008
That's a very strange error
No, doesn't help. At least not as it is now.
According to this
https://www.noreplied.com/how-to-fixed-cannot-change-locale-utf-8-error-in-centos-7/
one should add LC_CTYPE="en_US.UTF-8" to /etc/environment. Very unsatisfying fix for someone without root privileges.
This appears to be a bug with centos 7, which is merely exposed now.
Just exporting this environment variable setting in build/bin/sage-spkg should be enough. No need for a systemwide setting
Ok. Testing this now. https://github.com/kliem/sage/pull/19/checks
It appears that things fixed itself: https://github.com/sagemath/sage/actions/runs/166309975
Reviewer: Matthias Koeppe
Also the upgrade to Sphinx 3.1.2 is complete at #30001 and was merged in Sage 9.2.beta5.
Description changed:
---
+++
@@ -1,3 +1,8 @@
+`sage-system-python` is the Python that we use to download and unpack upstream archives. We allow a very wide range of Python versions.
+
+After #30053 undid the change in #29033, we are again running into locale / encoding problems with early (pre-3.7) versions of Python used as `sage-system-python`.
+
+Originally reported on this ticket:
```
/sage/build/bin/sage-spkg: line 73: warning: setlocale: LC_ALL: cannot change locale (C.UTF-8): No such file or directory
@@ -42,3 +47,11 @@
```
See e.g. https://github.com/sagemath/sage/runs/811777719?check_suite_focus=true
+
+But now (https://github.com/mkoeppe/sage/runs/1141520147) we see it for `ubuntu-trusty-minimal` (where `sage-system-python` is python 3.4).
+
+We fix this problem by making locale fixes in the `sage-system-python` script.
+
+
+
+Changed reviewer from Matthias Koeppe to https://github.com/mkoeppe/sage/actions/runs/264540655
Author: Matthias Koeppe
the error in the ticket description comes from Python 3.6. Is this outdated, in view of #30053, as the locale is staying C locale?
What is the commit in comment:20 doing? I am lost.
Replying to @dimpase:
the error in the ticket description comes from Python 3.6. Is this outdated, in view of #30053, as the locale is staying
Clocale?
build/bin/sage-system-python finds the Python that we use for downloading and unpacking packages.
It is unrelated to the python that we try to find in build/pkgs/python3/spkg-configure.m4 (which rejects Python 3.6 after #30053).
build/bin/sage-system-python can use any of the Pythons listed in build/tox.ini. In particular, it can be an early 3.x series Python. For these, LC_ALL=C disables UTF-8 operation, which causes the errors described on this ticket.
Look at the GH Actions links in the ticket description to see it in action.
Description changed:
---
+++
@@ -50,8 +50,7 @@
But now (https://github.com/mkoeppe/sage/runs/1141520147) we see it for `ubuntu-trusty-minimal` (where `sage-system-python` is python 3.4).
-We fix this problem by making locale fixes in the `sage-system-python` script.
+We fix this problem by making locale fixes in the `sage-system-python` script. See e.g. https://github.com/mkoeppe/sage/runs/1142310012 for a run with the present ticket.
-Ready for review
Replying to @seblabbe:
On Ubuntu 18.04 + beta12 + #30053 (+ #30606, the other branch I am working on now), I do not confirm this issue for building sphinx:
sage -f sphinxworks correctly. version 3.1.2.p0 to be more precise.
You probably have a better version of python in the front of the path than what is the default on Ubuntu 18.04. You can check this with build/bin/sage-system-python --version.
Why again are we trying to extract a tarball with python?
The problem here isn't really the C locale, it's that Sphinx ships files with non-ascii characters in them...
$ ls tests/roots/test-images/ | tail -n 1
-rw-r--r-- 1 mjo mjo 65K 2020-08-13 11:46 testimäge.png
and we don't specify an encoding, which ultimately leaves it up to the locale.
Ignoring the fact that reimplementing this is absurd, we should be able to set the filename encoding to utf8 here without touching the locale at all.
Replying to @orlitzky:
Why again are we trying to extract a tarball with python?
This question is not within the scope of this ticket
Replying to @orlitzky:
The problem here isn't really the
Clocale
Yes, it is specifically the problem with setting the C locale explicitly. This is explained well in https://www.python.org/dev/peps/pep-0538/
Replying to @orlitzky:
we should be able to set the filename encoding to utf8 here without touching the locale at all.
We are using the tarfile module as supplied by the system python.
Changed reviewer from https://github.com/mkoeppe/sage/actions/runs/264540655 to https://github.com/mkoeppe/sage/actions/runs/264540655, ...
Description changed:
---
+++
@@ -1,6 +1,8 @@
`sage-system-python` is the Python that we use to download and unpack upstream archives. We allow a very wide range of Python versions.
After #30053 undid the change in #29033, we are again running into locale / encoding problems with early (pre-3.7) versions of Python used as `sage-system-python`.
+
+This is a blocker ticket for Sage 9.2 because it is a severe regression in platform support compared to Sage 9.1.
Originally reported on this ticket:
I'm not willing to install a version of python from 2014 to test it, but setting tarfile.ENCODING = 'utf-8' probably solves this in a much better way, since we know what the encoding is.
Replying to @orlitzky:
I'm not willing to install a version of python from 2014 to test it
You don't have to. We have the GH Actions to test it. The link with the runs and results is in the "Reviewer field".
Replying to @orlitzky:
setting
tarfile.ENCODING = 'utf-8'probably solves this in a much better way, since we know what the encoding is.
No, that is absolutely not the correct fix because the default (obtained using sys.getfilesystemencoding() or sys.getdefaultencoding()) is correct. See https://docs.python.org/2.6/library/tarfile.html#tar-unicode (Python 2.6 is the earliest version that we support for sage-system-python).
The specific problem is that if someone puts a LC_ALL=C, the default encodings are set to ASCII. This is what causes the breakage, and this is what the current ticket fixes.
Needs review.
Setting the locale to C causes the default guess to be wrong, but we know the correct encoding. And the utf-8 encoding is available "everywhere," unlike the locales that we're guessing at in your branch. Finally, changing the system locale to affect one tarfile is like setting your house on fire to stop the faucet from leaking.
Replying to @orlitzky:
we know the correct encoding.
This is of course a good point. We can just decide that our tarballs have utf-8 pathnames. I will test the approach with setting tarfile.ENCODING.
Nevertheless, removing the C locale setting in sage-system-python would still be good because there could be other parts that are affected by this unsuitable setting.
And #30053 just put things back the way they were before we tried to switch to C.UTF-8. We need to set some locale, because otherwise the behavior of standard utilities in spkg-install and spkg-check scripts is unknowable. For example, this branch can choose the en_US.utf8 locale. How do grep ranges work there? According to https://www.gnu.org/software/grep/manual/html_node/Character-Classes-and-Bracket-Expressions.html,
In the default C locale, the sorting sequence is the native character order; for example, ‘[a-d]’ is equivalent to ‘[abcd]’. In other locales, the sorting sequence is not specified, and ‘[a-d]’ might be equivalent to ‘[abcd]’ or to ‘[aBbCcDd]’, or it might fail to match any character, or the set of characters that it matches might even be erratic. To obtain the traditional interpretation of bracket expressions, you can use the ‘C’ locale by setting the LC_ALL environment variable to the value ‘C’.
Mucking with this is bad luck.
Replying to @orlitzky:
And #30053 just put things back the way they were before we tried to switch to
C.UTF-8.
Yes, it was a good change for that ticket, which is why I agreed with the change there.
But it affects the Pythons that run below it.
(1) the sage-system-python - which the present ticket addresses
(2) the python3 from which we build our venv -- which is yet to be addressed in #30576 if we want to support python 3.6.
Branch pushed to git repo; I updated commit sha1. New commits:
ff0dbc6 | build/sage_bootstrap/uncompress/tar_file.py: Fix encoding to utf-8 |
Changed reviewer from https://github.com/mkoeppe/sage/actions/runs/264540655, ... to https://github.com/mkoeppe/sage/actions/runs/265853210, ...
Changed reviewer from https://github.com/mkoeppe/sage/actions/runs/265853210, ... to https://github.com/mkoeppe/sage/actions/runs/265872338, ...
Needs review.
Replying to @mkoeppe:
You probably have a better version of python in the front of the path than what is the default on Ubuntu 18.04. You can check this with
build/bin/sage-system-python --version.
I have:
$ build/bin/sage-system-python --version
Python 2.7.17
I don't know how to properly review this ticket. With ubuntu-bionic 18.04, I do not have problem. So what I can do is to check that with the branch, I still don't have a problem. Should I do some kind of make distclean? For now, I just pulled that branch on top of freshly built beta13, and I am currently running make ptestlong. Will report later.
The tests have already run on GH Actions. It is not necessary to run them again to review this ticket.
The problem shows on machines where sage-system-python is python 3.x with x < 7.
It's probably not a good idea to try to uninstall python2 to review this ticket, but you could try to shadow it
Replying to @mkoeppe:
you could try to shadow it
tell me how to shadow, I will try that tomorrow. How can I make sage-system-python to use:
$ which python3.6
/usr/bin/python3.6
?
mkdir /somewhere && ln -s /usr/bin/python3.6 /somewhere/python
export PATH=/somewhere:$PATH
Doesn't the tarball commit alone fix issue without the locale voodoo?
It does fix the issue, but it is best not to pass a locale to python 3.6 that we know to break things.
Seems to work fine and looks good to me.
Changed reviewer from https://github.com/mkoeppe/sage/actions/runs/265872338, ... to Jonathan Kliem
Replying to @mkoeppe:
mkdir /somewhere && ln -s /usr/bin/python3.6 /somewhere/python export PATH=/somewhere:$PATH
Indeed, the above shadowing method works:
$ build/bin/sage-system-python --version
Python 3.6.9
I can now reproduce the issue with Ubuntu 18.04 + 9.2.beta13 and I confirm that the current branch fixes it.
Changed reviewer from Jonathan Kliem to Jonathan Kliem, Sébastien Labbé
Thanks everyone!