Correctly fix checksums
vbraun opened this issue · 54 comments
Add functionality
sage --package download pari
sage --package fix-checksum pari
sage --package fix-checksum # fix all checksums
and deprecate sage-fix-pkg-checksums.
Also switch to standard argparse instead of reinventing the argument parsing wheel. And improve test coverage.
Component: build
Author: Volker Braun
Branch/Commit: b1c8a74
Reviewer: Jeroen Demeyer, Dima Pasechnik
Issue created by migration from https://trac.sagemath.org/ticket/19984
Branch: u/vbraun/correctly_fix_checksums
Description changed:
---
+++
@@ -1 +1,8 @@
+Add functionality
+```
+sage --package download pari
+sage --package fix-checksum pari
+sage --package fix-checksum # fix all checksums
+```
+Also switch to standard argparse instead of reinventing the argument parsing wheel. And improve test coverage.Author: Volker Braun
Description changed:
---
+++
@@ -5,4 +5,6 @@
sage --package fix-checksum pari
sage --package fix-checksum # fix all checksums
```
+and deprecate `sage-fix-pkg-checksums`.
+
Also switch to standard argparse instead of reinventing the argument parsing wheel. And improve test coverage.Branch pushed to git repo; I updated commit sha1. New commits:
443cf39 | Correct helpstring |
Branch pushed to git repo; I updated commit sha1. New commits:
8fe1549 | Also test sage --package update |
wow, argparse.py isn't a small file...
Making Sage 0.1% bigger vs. writing and maintaining your own parser is an easy choice...
it would be good to know whether it is just a pip call away.
It is, but not before we install pip which we download using sage_bootstrap
Is there a pressing need to have arg parsing before we have pip available?
I also don't get why we need to have in in sage, as it's standard Python:
$ python
Python 2.7.6 (default, Jun 22 2015, 17:58:13)
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import argparse
>>>
(and the same with $ sage --python)
- Scripts need to use things like
sage-download-file --print-fastest-mirror - Argparse is new in Python 2.7, but we want to support 2.6+. See also the tox.ini file.
Hmm, I presume one doesn't need to fix checksums before Sage's python is up and running.
I'd rather just rely on built-in argparse of Sage's python.
Or rather rely on system's python having argparse. I cannot imagine people on Python 2.6 who must build Sage from source, but cannot update Python, or at least install argparse.
sage-download-file is used for downloading packages, which we must do in order to install Python in the first place.
The fixing checksums is just an extra at this point, but of course its almost the same code as checking the checksum.
Btw pip itself bundles about 10 required packages because it can't be used to install its own dependencies either.
PS: OSX had Python 2.6 up until the last version, and good luck installing argparse globally without root.
Assuming you deprecate that, I think you'll also have to change http://doc.sagemath.org/html/en/developer/packaging.html#checksums and any other related documentation.
Replying to @vbraun:
PS: OSX had Python 2.6 up until the last version, and good luck installing argparse globally without root.
I am not worried about OSX users, who of them don't have root?
This should work (it used to work with sage-fix-pkg-checksums):
$ rm build/pkgs/cliquer/checksums.ini
$ sage --package fix-checksum cliquer
Traceback (most recent call last):
File "/usr/local/src/sage-git/build/bin/sage-package", line 42, in <module>
run()
File "/usr/local/src/sage-git/build/bin/../sage_bootstrap/cmdline.py", line 241, in run
app.fix_checksum(args.package_name)
File "/usr/local/src/sage-git/build/bin/../sage_bootstrap/app.py", line 157, in fix_checksum
pkg = update.package
File "/usr/local/src/sage-git/build/bin/../sage_bootstrap/updater.py", line 37, in package
self.__package = Package(self.package_name)
File "/usr/local/src/sage-git/build/bin/../sage_bootstrap/package.py", line 46, in __init__
self._init_checksum()
File "/usr/local/src/sage-git/build/bin/../sage_bootstrap/package.py", line 203, in _init_checksum
with open(checksums_ini, 'rt') as f:
IOError: [Errno 2] No such file or directory: '/usr/local/src/sage-git/build/pkgs/cliquer/checksums.ini'
Even this does not work:
$ ./sage --fix-pkg-checksums upstream/cliquer-1.21.tar.gz
Deprecated #19984, use sage --package fix-checksum instead
ERROR [tarball|__init__:62]: tarball cliquer-1.21.tar.gz is not referenced by any Sage package
Traceback (most recent call last):
File "/usr/local/src/sage-git/build/bin/sage-package", line 42, in <module>
run()
File "/usr/local/src/sage-git/build/bin/../sage_bootstrap/cmdline.py", line 228, in run
app.name(args.tarball_filename)
File "/usr/local/src/sage-git/build/bin/../sage_bootstrap/app.py", line 68, in name
tarball = Tarball(os.path.basename(tarball_filename))
File "/usr/local/src/sage-git/build/bin/../sage_bootstrap/tarball.py", line 63, in __init__
raise ValueError(error)
ValueError: tarball cliquer-1.21.tar.gz is not referenced by any Sage package
You need a way to tie tarballs to packages even if checksums.ini is not present or contains bogus information.
It is a feature that you need checksums.ini where the tarball name pattern is specified (the checksums are not needed). This is the only place that unambiguously ties together tarball file names with package names. You can have other heuristics but they can and will break, and you can't build a reliable system on top of sand? Besides, extra filename-based heuristics violate the single source of truth principle.
That $ ./sage --fix-pkg-checksums upstream/cliquer-1.21.tar.gz demonstrates the advantage of this ticket, the new tarball name is cliquer-1.21.tar.bz2 and if you accidentally enter the wrong extension you now get an error message instead of silently using the wrong tarball.
Replying to @dimpase:
I am not worried about OSX users, who of them don't have root?
I know a ton of people in the Oxford Math building that use OSX and do not have root access, for example.
Test comment, ignore
Replying to @vbraun:
Replying to @dimpase:
I am not worried about OSX users, who of them don't have root?
I know a ton of people in the Oxford Math building that use OSX and do not have root access, for example.
such people are already on OSX 10.11, I suppose, as no sysadmin would allow an un-updated host on their network...
(posted using geom as socks_v5 proxy...)
Considering that OSX tends to break stuff every version and that the Math network is heavily firewalled (not even ssh from the outside) I'm not so sure about that.
In any case, this ticket isn't about removing Python 2.6 support when building Sage. I don't really mind that, but it should be a separate ticket.
how are we going to keep track of the stuff that should be removed once the support for Python 2.6 is removed?
I still think that [comment:18] would be an annoying regression. Adding a new package should be as simple as possible. Requiring to manually add checksums.ini is an extra burden which didn't exist before.
Its even more annoying to change existing checksums.ini to different tarballs because of some unwritten nameing rule with .bz2 vs .gz. Not to mention being totally broken for some packages, e.g. python2.
It would be nice to have an easy way to create a new package, but it shouldn't be mashed into sage-fix-package-checksums. E.g.
sage --package create foo --url=http://path/to/foo.tar.gz
could create the package and populate it with template spkg-install etc.
perhaps more pressing issue is that the current assumptions about package archives namings are too rigid. E.g. Normaliz 3.1.0 has archive named
https://www.normaliz.uni-osnabrueck.de/wp-content/uploads/2016/02/Normaliz3.1.0.zip
which unpacks into Normaliz3.1 directory.
I do not know how to handled this without repackaging.
Another example is nauty 2.6, with archives named nauty26...
Agreed, the totally unnecessary naming restrictions are a pain. But the name handling of old-style spkgs is mashed into the new-style name handling code. And there is no testsuite apart from building all packages. So its hard to improve.
What is preventing this to go forward? I don't know - I don't understand the Jeroen's problem (or don't understand it any more).
Jeroen's problem is that this ticket adds an extra step to add a new package. Currently, checksums.ini is completely auto-generated: one does not need to worry at all about checksums.ini when creating a new package. This is no longer true with this branch because it requires that the developer adds a minimal checksums.ini.
On the plus side it actually works whereas the current script does not (e.g. try updating the python2 package #19735)
What exactly should go there in checksums.ini ? Some meta-information? Is it documented in this ticket?
It just needs to contain
tarball=Python-VERSION.tgz
to define (together with package-version.txt) the tarball to use.
So you can use tarballs named not according to Sage "standard" with this ticket?
This would be great, and this'd beat the slightly unfortunate fact that you need one more file.
I'm pretty sure there is more string matching scattered in various scripts to map between package names and tarball filenames, this should be replaced by the single point of truth
$ ./build/bin/sage-package tarball python3
Python-3.4.3.tar.gz
However this is future work and not what this package is about.
As soon as there is something in the developer docs that reflects this change, I'd be happy to give it a positive review.
Branch pushed to git repo; I updated commit sha1. New commits:
270a002 | Add a sage --package create subcommand to initialize/overwrite package data |
Branch pushed to git repo; I updated commit sha1. New commits:
b1c8a74 | Documentation checksums.ini |
Also adds sage --package create foo --version 1.3 --tarball FoO-VERSION.tar.gz --type experimental
$ sage --package fix-checksum *
usage: sage --package [-h] [--log LOG]
{fix-checksum,name,create,list,update,tarball,apropos,download,config}
...
sage --package: error: unrecognized arguments: aclocal.m4 autom4te.cache bootstrap build config config.log config.status configure configure.ac COPYING.txt local logs m4 Makefile README.md sage src upstream VERSION.txt
the logic behind the latter error message looks as if it parses the current directory, for some reason... It is a feature or a bug?
Bash expands * to all files in the current directory. Usage is:
$ ./sage --package fix-checksum -h
usage: sage --package fix-checksum [-h] [package_name]
positional arguments:
package_name Package name. Default: fix all packages.
optional arguments:
-h, --help show this help message and exit
Fix the checksum of a package
EXAMPLE:
$ sage --package fix-checksum pari
Updating checksum of pari-2.8-2044-g89b0f1e.tar.gz
My worry is not checking arguments for correctness might lead to screwups.
$ sage --package fix-checksum *
Checksum of pari-2.8-2341-g61b65cc.tar.gz unchanged
$ ls -l
total 0
-rw-rw-r-- 1 dima dima 0 Jun 13 07:27 pari
OK, this particular case is largely innocent, but still. I won't mind if this is addressed on another ticket.
Reviewer: Jeroen Demeyer, Dima Pasechnik
Changed branch from u/vbraun/correctly_fix_checksums to b1c8a74