multihash.is_valid returns false unexpectedly
Closed this issue · 4 comments
- py-multihash version: latest from PyPI (pip install py-multihash), I think: see #4
- Python version: Python 3.6.7 (default, Oct 22 2018, 11:32:17) [GCC 8.2.0]
- Operating System: linux
Description
I think mh = b'122041dd7b6443542e75701aa98a0c235951a28a0d851b11564d20022ab11d2589a8' should be a valid multihash, but multihash.is_valid(mh) returns otherwise. (Am I wrong?)
What I Did
Wrote two tests that fail unexpectedly...
import multihash
import base58
def test_is_vald_multihash():
    # from https://multiformats.io/multihash/#examples
    # e.g. blake 2s, 128 bits: b'b250100a4ec6f1629e49262d7093e2f82a3278'
    # sha2-256, 32 bits
    mh = b'122041dd7b6443542e75701aa98a0c235951a28a0d851b11564d20022ab11d2589a8'  # noqa
    assert multihash.is_valid(mh)  # surprising!


def test_is_vald_multihash_from_b58():
    mh = b'122041dd7b6443542e75701aa98a0c235951a28a0d851b11564d20022ab11d2589a8'  # noqa
    b58_enc_mh = base58.b58encode(mh)
    # convert the b58 encoded multihash from bytes into string
    str_b58_enc_mh = "".join(chr(x) for x in b58_enc_mh)
    assert multihash.is_valid(
        multihash.from_b58_string(str_b58_enc_mh))
# well, that explains why my code isn't working...
# hacking the tests above...
> raise Exception(multihash.decode(mh)) # DEBUG
tests/domain/test_URI.py:129:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
multihash = b'122041dd7b6443542e75701aa98a0c235951a28a0d851b11564d20022ab11d2589a8'
    def decode(multihash):
        """
        Decode a hash from the given multihash
        :param bytes multihash: multihash
        :return: decoded :py:class:`multihash.Multihash` object
        :rtype: :py:class:`multihash.Multihash`
        :raises TypeError: if `multihash` is not of type `bytes`
        :raises ValueError: if the length of multihash is less than 3 characters
        :raises ValueError: if the code is invalid
        :raises ValueError: if the length is invalid
        :raises ValueError: if the length is not same as the digest
        """
        if not isinstance(multihash, bytes):
            raise TypeError('multihash should be bytes, not {}', type(multihash))
        if len(multihash) < 3:
            raise ValueError('multihash must be greater than 3 bytes.')
        buffer = BytesIO(multihash)
        try:
            code = varint.decode_stream(buffer)
        except TypeError:
            raise ValueError('Invalid varint provided')
        if not is_valid_code(code):
>           raise ValueError('Unsupported hash code {}'.format(code))
E           ValueError: Unsupported hash code 49
But it's not hashcode 49, it's hashcode 18 (0x12 == sha2-256).
Hello @monkeypants,
This hexadecimal string:
122041dd7b6443542e75701aa98a0c235951a28a0d851b11564d20022ab11d2589a8
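is indeed a valid multihash (with hashcode 18, sha2-256). What's wrong is the way you're passing it to the multihash.decode() function, using b''. A b'' literal holds the ASCII characters of the hex text rather than the bytes they encode, so the first byte is the character '1' (value 49) instead of 0x12 (18).
What you need to do is convert the hex string into bytes by using bytes.fromhex():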
bytes.fromhex('122041dd7b6443542e75701aa98a0c235951a28a0d851b11564d20022ab11d2589a8')
b'\x12 A\xdd{dCT.up\x1a\xa9\x8a\x0c#YQ\xa2\x8a\r\x85\x1b\x11VM \x02*\xb1\x1d%\x89\xa8'
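The difference between the two forms can be checked with the standard library alone. This is a small sketch; the variable names are only illustrative and nothing here touches py-multihash itself.

```python
hex_text = '122041dd7b6443542e75701aa98a0c235951a28a0d851b11564d20022ab11d2589a8'

# Writing b'1220...' stores the ASCII characters of the hex text,
# which is the same value as encoding the string as ASCII.
ascii_bytes = hex_text.encode('ascii')

# bytes.fromhex() turns each pair of hex digits into one byte.
raw_bytes = bytes.fromhex(hex_text)

print(len(ascii_bytes), ascii_bytes[0])  # 68 49  (49 == ord('1'))
print(len(raw_bytes), raw_bytes[0])      # 34 18  (18 == 0x12, sha2-256)
```

The 49 printed for the ASCII form is the same "hash code 49" that appears in the ValueError above.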
And then of course py-multihash has no issue decoding that:
multihash.decode(bytes.fromhex('122041dd7b6443542e75701aa98a0c235951a28a0d851b11564d20022ab11d2589a8'))
Multihash(code=18, name='sha2-256', length=32, digest=b'A\xdd{dCT.up\x1a\xa9\x8a\x0c#YQ\xa2\x8a\r\x85\x1b\x11VM \x02*\xb1\x1d%\x89\xa8')
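To see where code=18 and length=32 come from, the prefix can also be read by hand. The helper below, split_multihash, is a hypothetical name and not part of py-multihash. It is a simplified sketch that assumes the code and the length each fit in a single byte (values below 128), which holds for sha2-256 but not for every multihash.

```python
def split_multihash(raw):
    """Split raw multihash bytes into (code, length, digest).

    Simplified sketch: assumes the code and length varints are one byte each.
    """
    if len(raw) < 3:
        raise ValueError('multihash too short')
    code, length = raw[0], raw[1]
    if code >= 0x80 or length >= 0x80:
        raise ValueError('multi-byte varints are not handled in this sketch')
    digest = raw[2:]
    if len(digest) != length:
        raise ValueError('digest length does not match the length byte')
    return code, length, digest


hex_text = '122041dd7b6443542e75701aa98a0c235951a28a0d851b11564d20022ab11d2589a8'
code, length, digest = split_multihash(bytes.fromhex(hex_text))
print(code, length, len(digest))  # 18 32 32
```

Passing hex_text.encode('ascii') instead makes this helper raise ValueError, because the first two bytes are then read as 49 and 50 and the remaining 66 bytes do not match a length of 50.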
Take care.
There's actually a function called multihash.from_hex_string() which does exactly the same.
thanks!