Nondeterministic string hashing in Python (>=3.3)
Cjen1 opened this issue · 1 comments
Cjen1 commented
I was running into some weird issues with incorrect file caching of the result of a function applied to a string.
This is because Python (>=3.3) salts its hash function by default (for strings at least), so the same string hashes to a different value in each new process.
Specifically:
> python -c "print(hash('asdf'))"
-8690208562067163084
> python -c "print(hash('asdf'))"
-4220296486527231708
The fix for this is to set the environment variable PYTHONHASHSEED=1.
The 'proper' fix would be to substitute the internal hash function with something more suitable; however, I couldn't immediately see the right place to inject that.
> PYTHONHASHSEED=1 python -c "print(hash('asdf'))"
-5132432945605986887
> PYTHONHASHSEED=1 python -c "print(hash('asdf'))"
-5132432945605986887
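For reference, here is a minimal sketch of what "something more suitable" could look like. It assumes an unsalted digest from the standard library's hashlib is acceptable as a cache key. The name stable_hash is hypothetical and not part of this project, and where such a function would be hooked into the caching code is still an open question.

```python
import hashlib


def stable_hash(text: str) -> int:
    """Return a 64-bit integer digest of `text` that is the same in every process."""
    # SHA-256 is not salted, so the digest depends only on the UTF-8 bytes of the input.
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    # Keep the first 8 bytes to get a compact integer, similar in size to hash().
    return int.from_bytes(digest[:8], "big")


# Hypothetical usage: the key no longer depends on PYTHONHASHSEED.
print(stable_hash("asdf") == stable_hash("asdf"))
```

Unlike the built-in hash(), this only covers strings; other argument types would need their own deterministic serialization before hashing.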
tillahoffmann commented
Thanks for reporting. This should only be a problem if results are cached across different processes. Do you have a reproducible code snippet to illustrate the issue?
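One way the cross-process behaviour could be illustrated, independent of any caching code, is sketched below. It is not the reporter's snippet: hash_in_new_process is a hypothetical helper that starts a fresh interpreter for each call, and it assumes Python 3.7+ for the capture_output argument of subprocess.run.

```python
import os
import subprocess
import sys


def hash_in_new_process(text: str, seed: str) -> int:
    """Evaluate hash(text) in a fresh interpreter with PYTHONHASHSEED set to `seed`."""
    env = dict(os.environ, PYTHONHASHSEED=seed)
    result = subprocess.run(
        [sys.executable, "-c", f"print(hash({text!r}))"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout)


# "random" asks for a fresh salt in each process, which is the default behaviour.
print(hash_in_new_process("asdf", "random"), hash_in_new_process("asdf", "random"))
# A fixed seed yields the same value in every process.
print(hash_in_new_process("asdf", "1") == hash_in_new_process("asdf", "1"))
```

Within a single process the salt is fixed at startup, which is why the mismatch only shows up when a cache written by one process is read by another.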