sha256 inconsistency between local hash and github actions
trhallam opened this issue · 6 comments
The sha256 hash on github actions did not match my local hash:
Local: Ubuntu 20.04 on WSL, hash created using openssl
Github actions: Python 3.9, ubuntu-latest
Data is being sourced from GitHub raw. The GitHub Actions runner appears to compute a different sha256 hash from the one I get locally. Switching to md5 fixed this.
import os
import pooch
from . import __version__

GOODBOY = pooch.create(
    path=os.curdir,
    base_url="https://raw.githubusercontent.com/trhallam/digirock/main/tests/test_data/",
    version=__version__,
    # If this is a development version, get the data from the main branch
    version_dev="main",
    # The registry specifies the files that can be fetched from the local storage
    registry={
        "COMPLEX_PVT.inc": "3018c7ec33dded551e0bcd44103a1abd27ff4895268c712197616e396532da25",
        "PVT_BO.inc": "053669c122948b690b03bcd2e5d11bdbc377bf84cddcd0d614ee19ec22ca36b6",
        "PVT_RS.inc": "ff869731b2ece69fa0686b6a0204f113a0106e359413ddf1547841cbdf3d219d",
    },
)
Data is located here: https://github.com/trhallam/digirock/tree/main/tests/test_data
Creates error:
SHA256 hash of downloaded file (COMPLEX_PVT.inc) does not match the known hash: expected 3018c7ec33dded551e0bcd44103a1abd27ff4895268c712197616e396532da25 but got 2bb908ad754ac1939a6d3cc34e3997c0c120424f27a028dcdbce288e338fe00e. Deleted download for safety. The downloaded file may have been corrupted or the known hash may be outdated.
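One way to cross-check the digests independently of both openssl and Pooch is to hash the file with Python's standard library. This is only a minimal sketch; the helper name `hex_digest_of_file` is made up for illustration and is not part of Pooch.

```python
import hashlib


def hex_digest_of_file(path, algorithm="sha256", chunk_size=65536):
    """Return the hex digest of the file at `path`.

    The file is opened in binary mode so that no newline translation
    can change the bytes that are hashed.
    """
    hasher = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # Read in chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()
```

Running this on the local copy of COMPLEX_PVT.inc with both "sha256" and "md5", and again on a copy saved from the URL that the runner uses, would show whether the two files really hold the same bytes.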
@trhallam do you have a link to the Actions job that fails? That could help us figure out what's going on.
I ran openssl here and got the same SHA256 as you. Strange that MD5 works for this when SHA doesn't.
One quick comment on the code above: you probably want to use {version} in the URL instead of main, so that releases get data from the tag instead of the main branch. Otherwise, changes to the data on main can break previous releases.
Hi @leouieda, I think this was the workflow -> https://github.com/trhallam/digirock/runs/5357449619?check_suite_focus=true
It is all green because the error is handled inside a notebook example during the build.
I don't have a release yet, hence things are not tied to a version but thanks for the tip, I'll try to update that when I create the first release.
Thanks, I'll have a look at the log to see if I can spot anything.
> I don't have a release yet, hence things are not tied to a version but thanks for the tip, I'll try to update that when I create the first release.
You can do that now: pre-release version numbers from setuptools-scm are treated as development versions, so they will fall back to version_dev ("main").
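The fallback described above can be sketched in plain Python. This is a simplified illustration, not Pooch's actual code: `resolve_version` and `data_url` are hypothetical helpers, the version strings are invented, and the check assumes that a development version is one carrying a PEP 440 local segment (the part after "+"), as setuptools-scm produces between tags.

```python
# Base URL from the issue, with "main" swapped for a {version} placeholder.
BASE_URL = "https://raw.githubusercontent.com/trhallam/digirock/{version}/tests/test_data/"


def resolve_version(version, fallback="main"):
    """Hypothetical helper: use `fallback` for development versions.

    Simplification: any version with a local segment ("+...") is
    treated as a development build.
    """
    return fallback if "+" in version else version


def data_url(version, filename):
    """Build the download URL for `filename` at the given package version."""
    return BASE_URL.format(version=resolve_version(version)) + filename


# A tagged release points at its own tag ...
print(data_url("0.1.0", "PVT_BO.inc"))
# ... while a development build points at the main branch.
print(data_url("0.1.1.dev3+g1a2b3c4", "PVT_BO.inc"))
```

With this scheme, data changes on main only affect development installs, while a released version keeps fetching the files that existed at its tag.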
@trhallam I had a look at the log and tried to see if I could somehow reproduce this behaviour but so far I got nothing. I thought it could be GitHub caching an old version of the data for some reason but then the MD5 would fail as well.
Have you tried using the https://github.com/trhallam/digirock/raw/main/ URL instead?
Looks like switching to this URL: https://github.com/trhallam/digirock/raw/{version}/ and adjusting my versioning scheme worked.
Perhaps the raw URL makes some small changes to the bytes of the file.
Thanks for your help.
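Whether the two URLs really serve different bytes could be checked directly. The sketch below is one possible way to do that with the standard library only; `fetch` and `compare_payloads` are illustrative names, and nothing here has been run against the repository in question.

```python
import hashlib
import urllib.request


def fetch(url):
    """Download `url` and return the response body as bytes."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def compare_payloads(first, second):
    """Summarise how two byte strings differ."""
    # Offset of the first differing byte within the common length, if any.
    mismatch = next(
        (i for i, (a, b) in enumerate(zip(first, second)) if a != b), None
    )
    # If one payload is a strict prefix of the other, they diverge at its end.
    if mismatch is None and len(first) != len(second):
        mismatch = min(len(first), len(second))
    return {
        "identical": first == second,
        "lengths": (len(first), len(second)),
        "sha256": (
            hashlib.sha256(first).hexdigest(),
            hashlib.sha256(second).hexdigest(),
        ),
        "first_mismatch": mismatch,
    }
```

Calling `compare_payloads(fetch(a), fetch(b))`, with `a` set to https://raw.githubusercontent.com/trhallam/digirock/main/tests/test_data/COMPLEX_PVT.inc and `b` set to https://github.com/trhallam/digirock/raw/main/tests/test_data/COMPLEX_PVT.inc, would report the lengths, digests, and first differing offset of the two downloads.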
Glad it worked! I have noticed that it does some caching at times when serving content. So changing things in the repo doesn't automatically update the raw URL.