JuliaString/MurmurHash3.jl

Usage example

roshii opened this issue · 6 comments

Hi,

Could you please provide an example or two showing how to get the hash of a string?

Here is what I tried:

julia> str = "hello"
"hello"

julia> mmhash32(sizeof(str), pointer(str), 0%UInt32)
ERROR: MethodError: no method matching fmix(::Int64)
Closest candidates are:
  fmix(::UInt32) at /home/simon/Projects/MurmurHash3.jl/src/MurmurHash3.jl:258
  fmix(::UInt64) at /home/simon/Projects/MurmurHash3.jl/src/MurmurHash3.jl:31
Stacktrace:
 [1] mmhash32(::Int64, ::Ptr{UInt8}, ::UInt32) at /home/simon/Projects/MurmurHash3.jl/src/MurmurHash3.jl:273
 [2] top-level scope at none:0

Thanks in advance!

Changing mhblock to @inline (h1, k1) = rotl13(xor(h1, rotl15(k1 * d1) * d2)) * 0x00000005 + 0xe6546b64 resolves the error above, but the result still does not match the expected value for the 32-bit unsigned MurmurHash3.
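A possible explanation for MethodErrors of this kind, sketched under an assumption: the package source is not quoted in this thread, so it is only a guess that the UInt32 hash state gets combined with a plain Int value (an untyped literal or a length from sizeof). Julia's promotion rules would then give a different result type depending on the width of Int:

```julia
# Illustration of Julia's integer promotion rules only; this is not the package's code.
h = 0x00000001                    # an 8-digit hex literal is a UInt32

t_mul64 = typeof(h * Int64(5))    # Int64: UInt32 combined with a wider signed integer widens
t_xor64 = typeof(xor(h, Int64(5)))  # Int64 as well, e.g. when xor-ing with a length
t_mul32 = typeof(h * Int32(5))    # UInt32: same-width signed/unsigned promotes to unsigned
t_lit   = typeof(h * 0x00000005)  # UInt32: both operands are already UInt32

println((t_mul64, t_xor64, t_mul32, t_lit))
```

On a 64-bit platform Int is Int64, so an expression such as h * 5 leaves UInt32 and no longer matches methods defined only for unsigned arguments; typed literals such as 0x00000005 avoid that.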

Ah, sorry. Just saw this. What is versioninfo() on the platform you are running on?
I hadn't ever tested using MurmurHash3 outside of the Strs.jl (or StrBase.jl) package(s).
If you want the hash of a string using one of the string types in the Strs package, just call hash.

Just use the hash function from Julia Base to get the hash value of any string.
There is documentation at the REPL.

Thanks for the above.

Regarding "Just use the hash function from Julia Base to get the hash value of any string":

Unfortunately, Base.hash implements the 128-bit version of MurmurHash3, whereas I need the 32-bit version.

In the meantime, I managed to modify the Murmur3.jl package to produce the expected result for a 32-bit hash, matching the SMHasher reference implementation. The different hashing results are shown below.

julia> str = "Hello World"
"Hello World"

julia> hash(str)
0x9c427f8f448ad93f
julia> mmhash32(sizeof(str), pointer(str), 0%UInt32)
0xf71c2877

julia> Int(ans)
4145817719
julia> Murmur3.hash32(str)
427197390

As you can see, Base.hash returns a 64-bit hash, and your implementation returns a 32-bit hash, but not the expected value.

Either I am not calling mmhash32 correctly, or there is something wrong with it. I will try using the Str type to make it work, but regardless, it would be nice if MurmurHash3.jl could handle String as well :)
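For comparison, here is a minimal standalone sketch of the 32-bit variant (MurmurHash3_x86_32), written after the public SMHasher reference algorithm. It is not the code of MurmurHash3.jl or Murmur3.jl, and the names rotl32, fmix32 and murmur3_32 are hypothetical helpers chosen for this example. It reads the bytes of a String through codeunits, so no pointer is needed:

```julia
# Sketch of MurmurHash3_x86_32; all names here are illustrative, not a package API.
rotl32(x::UInt32, r::Integer) = (x << r) | (x >>> (32 - r))

# Final avalanche step of the 32-bit variant.
function fmix32(h::UInt32)
    h ⊻= h >>> 16
    h *= 0x85ebca6b
    h ⊻= h >>> 13
    h *= 0xc2b2ae35
    h ⊻= h >>> 16
    return h
end

function murmur3_32(data::AbstractVector{UInt8}, seed::UInt32 = 0x00000000)
    c1 = 0xcc9e2d51
    c2 = 0x1b873593
    h = seed
    len = length(data)
    nblocks = len >> 2

    # Body: consume 4 bytes at a time as a little-endian UInt32.
    for i in 0:nblocks-1
        j = 4i
        k = UInt32(data[j+1]) | (UInt32(data[j+2]) << 8) |
            (UInt32(data[j+3]) << 16) | (UInt32(data[j+4]) << 24)
        k *= c1
        k = rotl32(k, 15)
        k *= c2
        h ⊻= k
        h = rotl32(h, 13)
        h = h * 0x00000005 + 0xe6546b64   # typed literals keep h a UInt32
    end

    # Tail: the remaining 0 to 3 bytes.
    k = 0x00000000
    t = 4nblocks
    ntail = len & 3
    if ntail >= 3
        k ⊻= UInt32(data[t+3]) << 16
    end
    if ntail >= 2
        k ⊻= UInt32(data[t+2]) << 8
    end
    if ntail >= 1
        k ⊻= UInt32(data[t+1])
        k *= c1
        k = rotl32(k, 15)
        k *= c2
        h ⊻= k
    end

    # Finalization: mix in the length (truncated to 32 bits), then avalanche.
    h ⊻= len % UInt32
    return fmix32(h)
end

# Convenience method for String: hash its UTF-8 code units.
murmur3_32(s::String, seed::UInt32 = 0x00000000) = murmur3_32(codeunits(s), seed)
```

Calling murmur3_32("Hello World") gives a UInt32 that can be compared with the outputs above; note that the length is converted with % UInt32 before being mixed in, so the state never leaves UInt32.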

For the record:

julia> versioninfo()
Julia Version 1.1.0
Commit 80516ca202 (2019-01-21 21:24 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Core(TM) i5-6200U CPU @ 2.30GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.1 (ORCJIT, skylake)
Environment:
  JULIA_EDITOR = atom  -a
  JULIA_NUM_THREADS = 2

Here is the result of mmhash32 using the Str type.
It behaves exactly the same as with the String type.

julia> str = StrBase.Str("Hello World")
"Hello World"

julia> MurmurHash3.mmhash32(sizeof(str), pointer(str), 0%UInt32)
ERROR: MethodError: no method matching rotl(::Int64, ::Int64)
Closest candidates are:
  rotl(::Unsigned, ::Any) at /home/simon/.julia/packages/MurmurHash3/6widu/src/MurmurHash3.jl:20
Stacktrace:
 [1] mhblock at /home/simon/.julia/packages/MurmurHash3/6widu/src/MurmurHash3.jl:247 [inlined]
 [2] mhbody at /home/simon/.julia/packages/MurmurHash3/6widu/src/MurmurHash3.jl:264 [inlined]
 [3] mmhash32(::Int64, ::Ptr{UInt8}, ::UInt32) at /home/simon/.julia/packages/MurmurHash3/6widu/src/MurmurHash3.jl:271
 [4] top-level scope at none:0

OK, I hadn't spent much time on mmhash32, since I was only using it on 32-bit platforms and didn't test it on 64-bit platforms.
I'll take a look, and also look at adding support for calling it with a String.