google/go-jsonnet

Tarball downloads aren't stable for google/jsonnet

rockwotj opened this issue · 4 comments

It seems like the build depends on a git hash:

CPP_JSONNET_GITHASH = "813c7412d1c7a42737724d011618d0fd7865bc69"

According to GitHub, the checksums of the archives generated for those hashes are subject to change: bazel-contrib/SIG-rules-authors#11

This should probably depend on a tag instead.

Currently our CI is failing due to this:

Error in download_and_extract: java.io.IOException: Error downloading [https://github.com/google/jsonnet/archive/813c7412d1c7a42737724d011618d0fd7865bc69.tar.gz] to /private/var/tmp/_bazel_disco/b88ec0eee6c2708271c6fcc3bb3a3afd/external/cpp_jsonnet/temp11367785044586142800/813c7412d1c7a42737724d011618d0fd7865bc69.tar.gz: Checksum was c723cc1b1ac0a4369c1a6d0a644c2c0cb1ce8d148b742451b6c2d708150ebb5f but wanted af7c9c102daab64de39fe9e479acc7389b8dd2d0647c2f9c6abc9c429070b0b8
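For anyone wanting to reproduce this kind of failure outside Bazel, here is a minimal Python sketch of the check that the error message describes: hash the downloaded file and compare it with a pinned value. The helper names `sha256_of_file` and `check_archive` and the stand-in file contents are hypothetical; this is not Bazel's implementation.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_archive(path, expected_sha256):
    """Raise ValueError if the file's checksum differs from the pinned one."""
    actual = sha256_of_file(path)
    if actual != expected_sha256:
        raise ValueError(f"Checksum was {actual} but wanted {expected_sha256}")
    return actual


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical stand-in for a downloaded tarball.
        archive = Path(tmp) / "example.tar.gz"
        archive.write_bytes(b"stand-in archive bytes")
        pinned = sha256_of_file(archive)
        check_archive(archive, pinned)  # passes: bytes match the pin

        # Simulate the server regenerating the archive with different bytes.
        archive.write_bytes(b"stand-in archive bytes, regenerated")
        try:
            check_archive(archive, pinned)
        except ValueError as err:
            print(err)
```

The check only passes while the server keeps returning byte-identical archives, which is exactly the property in question here.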

Based on the quote below (although, from the rest of the thread, GitHub's stance on this still does not seem totally clear), google/jsonnet would need to start manually uploading a source archive as part of its release process, because the automatically generated ones cannot be relied upon to be stable, even for tags.

If you generate a release for a particular tag, and you upload your own assets, such as a tarball or binaries, we'll guarantee those don't change. However, the automated "Source code (tar.gz)" and "Source code (zip)" links, as well as any automated archives we generate, aren't guaranteed to be stable. That's because Git doesn't guarantee stability here and we rely on Git to generate those archives on the fly, so as we upgrade, things may change.

If you need a stable source code archive, please generate a release and upload your own archive as part of this process, and then you can reference those with stable hashes.

Source
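To make the quoted point concrete, here is a small Python sketch showing how the same content can yield different archive bytes, and therefore different checksums, when only the compressor settings change. The two compression levels are an assumption standing in for two versions of an archiving toolchain; this does not reproduce what GitHub or Git actually changed.

```python
import gzip
import hashlib

# Hypothetical stand-in for the contents of a source archive.
payload = b"".join(b"line %d of a source file\n" % i for i in range(5000))

# Same input, two compressor settings; mtime is fixed so that the
# timestamp in the gzip header is not the source of the difference.
fast = gzip.compress(payload, compresslevel=1, mtime=0)
best = gzip.compress(payload, compresslevel=9, mtime=0)

same_content = gzip.decompress(fast) == gzip.decompress(best)
same_bytes = fast == best

print("same decompressed content:", same_content)
print("same compressed bytes:", same_bytes)
print("sha256 (level 1):", hashlib.sha256(fast).hexdigest())
print("sha256 (level 9):", hashlib.sha256(best).hexdigest())
```

A checksum pinned against one of these outputs would reject the other, even though both unpack to identical content.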

Yeah, it looks like they walked this back, but they still want to discourage people from relying on those generated archives.

https://github.blog/changelog/2023-01-30-git-archive-checksums-may-change/

Really, the git revision ought to be good enough, as opposed to hashing the content of the tarball. I'm not sure what you can do in your CI system, though.
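One way to picture the difference between hashing the tarball and identifying the content is the Python sketch below, which computes a digest over the files inside an archive rather than over the archive bytes. The helpers `make_tar_gz` and `content_digest` are hypothetical, and this is not something Bazel's `http_archive` does; it only illustrates why a content-based identifier, like a git revision, is unaffected by how the tarball was packed.

```python
import hashlib
import io
import tarfile


def make_tar_gz(files, mtime):
    """Build an in-memory .tar.gz from {name: bytes}, stamping each entry with mtime."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            info.mtime = mtime
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def content_digest(archive_bytes):
    """Hash file names and contents in sorted order, ignoring tar/gzip metadata."""
    digest = hashlib.sha256()
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:gz") as tar:
        members = sorted(
            (m for m in tar.getmembers() if m.isfile()), key=lambda m: m.name
        )
        for member in members:
            data = tar.extractfile(member).read()
            # Length-prefix each field so adjacent entries cannot be confused.
            digest.update(member.name.encode("utf-8") + b"\0")
            digest.update(str(len(data)).encode("ascii") + b"\0")
            digest.update(data)
    return digest.hexdigest()


# Hypothetical file set; the two archives differ only in entry timestamps.
files = {"README.md": b"hello\n", "src/main.cc": b"int main() { return 0; }\n"}
first = make_tar_gz(files, mtime=1)
second = make_tar_gz(files, mtime=2)

print("archive bytes equal:", first == second)
print("content digests equal:", content_digest(first) == content_digest(second))
```

The trade-off is that the archive has to be unpacked before it can be verified, whereas a checksum over the tarball can be checked on the raw download.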

At Google we can repeat any build from any time and get exactly the same bytes, but that requires having a consistent version of all the software used to produce the tarball, including the tar executable itself, the stdlib, and even the compiler. That's not necessarily an option available to everyone.