Hash of compute packages changes without changes to the contents
jdno opened this issue · 10 comments
Version
$ fastly version
Fastly CLI version v4.5.0 (c56025c)
Built with go version go1.18.9 linux/amd64
What happened
Hi there,
Running fastly compute build changes the hash of the package, even though the code hasn't changed.
$ fastly compute build &>/dev/null && sha512sum pkg/compute-static.tar.gz
42189c8caad903cada550d4aeffe6b29f769ef3c1320801f137a01967829cc0f9b300fe6174872bdcbc3c844ecb6141fb9fcd6426eb4e9e85d04edf014b68d72 pkg/compute-static.tar.gz
$ fastly compute build &>/dev/null && sha512sum pkg/compute-static.tar.gz
e61c379cda3b1dcdb3824d7a110aa2369cf5fe2e8d3aff2df0046b2caa23ba6e648700d793863dd42e304bf768ded231596f47c536a8c5528df4851c576768d3 pkg/compute-static.tar.gz
The files included in the archive are unchanged and have the same checksum both times.
As far as I can tell, this is caused by gzip putting metadata into the archive. That can be disabled with flags, but I have no idea how that works with the Go library. 😬 Reference: https://stackoverflow.com/questions/36464358/why-do-the-md5-hashes-of-two-tarballs-of-the-same-file-differ
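For anyone who wants to see the effect in isolation, here is a minimal sketch. It compresses the same bytes twice and changes only the file's modification time in between. demo_gzip_mtime is a hypothetical helper name, and the sketch assumes a gzip that supports -n (omit the original name and timestamp from the header) plus sha512sum on the PATH. With GNU gzip the first pair of hashes typically differs and the second pair matches; other gzip implementations may behave differently for the default case.

```shell
#!/bin/bash
# Sketch: compress identical content twice, changing only the file's mtime
# in between, and print the SHA-512 of each compressed stream.
# demo_gzip_mtime is a hypothetical helper, not part of the Fastly CLI.
demo_gzip_mtime() {
  local flag="$1" dir
  dir="$(mktemp -d)" || return 1
  printf 'same content\n' > "$dir/file.txt"

  # First run: file dated 2000-01-01.
  touch -t 200001010000.00 "$dir/file.txt"
  gzip ${flag:+"$flag"} -c "$dir/file.txt" | sha512sum

  # Second run: same bytes, file dated 2010-01-01.
  touch -t 201001010000.00 "$dir/file.txt"
  gzip ${flag:+"$flag"} -c "$dir/file.txt" | sha512sum

  rm -rf "$dir"
}

demo_gzip_mtime ""   # default: the gzip header may carry the name and mtime
demo_gzip_mtime -n   # -n: name and mtime are left out of the header
```

Note that this only covers the gzip header. The tar entries inside the archive carry their own timestamps and ownership fields, which would need to be fixed separately.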
This is particularly problematic for us, since we are using the Terraform provider to manage the Compute service. Every time the hash "has changed", Terraform will upload the package again and create a new version for the service.
There are two reasons why I am concerned about this:
- It makes it impossible to catch instances where the package was accidentally changed, since the assumption right now is that a changed hash is just gzip's fault.
- It creates a lot of noise on Fastly, where the audit logs show updated packages for pretty much every version.
Would it be possible to create a stable hash based on the package contents?
Thanks,
JD
Hi @jdno
Thanks for opening this issue.
Re: "Would it be possible to create a stable hash based on the package contents?"
We've been looking at changes to our API to better support generating a hash from the content of the package (as a way to avoid this issue) but unfortunately, the blast radius of that change meant we had quite a few hoops to jump through.
We're still figuring out the best way to solve this issue, but we are aware that this is a problem for customers and are actively looking to resolve it as soon as possible.
Thanks.
Thanks for the update! If there is anything that I can do to help, just let me know. 🙂
What is the status of this issue?
For everybody else running into this problem: you can repack your tar file with the following script:
#!/bin/bash
# Script to repack a tar.gz file with a fixed owner, group and timestamp
# Solves https://github.com/fastly/cli/issues/743
cd pkg/ || exit 1
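# Package name from fastly.toml; abort with a message if it is missing
folder="${1:?usage: pass the package name from fastly.toml}"
tarfile="${folder}.tar.gz"
# Extract the archive (this creates the "$folder" directory)
tar -xzf "$tarfile"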
# tar all files again but with a fixed owner, group and timestamp. See https://stackoverflow.com/a/54908072
tar --sort=name --owner=root:0 --group=root:0 --mtime='UTC 2019-01-01' -czf "$tarfile" "$folder" 1> /dev/null
# Remove folder which was created during extraction
rm -rf "$folder"
Just provide the name from fastly.toml to the script.
So this particular issue ended up becoming a much larger piece of work (internally) to address.
The Fastly API (as of last week) now exposes a new property called files_hash:
https://developer.fastly.com/reference/api/services/package/#metadata-model
Hash of the files within the Compute@Edge package.
I'm just wrapping up another piece of work and then I'm going to be integrating this new property into the Fastly Terraform provider (so it replaces the use of the hashsum property).
The difference is that files_hash is a hash of all the files within the package (in sorted order) while hashsum was a hash of the package.tar.gz itself.
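To make the distinction concrete, here is a rough shell sketch of a content-based hash. It is not Fastly's actual files_hash algorithm, which this thread does not spell out. content_hash is a hypothetical helper, and the sketch assumes find -print0, sort -z, xargs -0 and sha512sum are available (as on a typical GNU/Linux system).

```shell
#!/bin/bash
# Sketch of a content-based package hash (NOT Fastly's real files_hash).
# content_hash is a hypothetical helper: it extracts the archive, hashes each
# file in byte-wise sorted path order, then hashes the resulting listing.
content_hash() {
  local archive="$1" dir
  dir="$(mktemp -d)" || return 1
  if ! tar -xzf "$archive" -C "$dir"; then
    rm -rf "$dir"
    return 1
  fi
  (
    cd "$dir" || exit 1
    # Each listing line is "<sha512>  <relative path>", so both the file
    # contents and the file paths feed into the final hash.
    find . -type f -print0 | LC_ALL=C sort -z | xargs -0 sha512sum
  ) | sha512sum | cut -d' ' -f1
  rm -rf "$dir"
}
```

Because timestamps, ownership and the gzip header never enter the calculation, two archives built from the same files give the same value, for example when comparing `content_hash pkg/compute-static.tar.gz` across two builds, while sha512sum of the archives themselves can still differ.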
I appreciate your patience on this. There were a surprising number of moving pieces 'behind the scenes' that led to the delay in getting to this point in time, but hopefully I'll have a new release out soon that finally addresses this issue.
Thanks for the explanation.
Re: "I'm just wrapping up another piece of work and then I'm going to be integrating this new property into the Fastly Terraform provider (so it replaces the use of the hashsum property)."
Can you link to the related issue in the Fastly Terraform provider GitHub repo, or say in which version this change will land?
I've created this issue so you can follow notifications:
fastly/terraform-provider-fastly#697
I've just merged fastly/terraform-provider-fastly#698 and intend to publish a new major release of the Fastly Terraform provider, so in the meantime I'm going to close this issue.
Thanks! Works as expected.
Great to hear! Thanks for the update @mlegenhausen 👍🏻