Can we have a str.from_hex() function?
didip opened this issue · 8 comments
Pretty much the opposite of:
function _M.to_hex(s)
    local len = #s * 2
    local buf = ffi_new(str_type, len)
    C.ngx_hex_dump(buf, s, #s)
    return ffi_str(buf, len)
end
@didip
That will be helpful.
However, in some circumstances you can use base64 instead.
@agentzh Work has been crazy recently, but yes, I have been slowly working on the patch.
Hello!
I've added a from_hex() function to the string.lua module.
However, my original implementation was not stable.
So I added other implementations and did a little research.
I now have 3 variants of from_hex():
- pure Lua
- FFI version based on strtol
- FFI version based on sscanf
And the results, in the same order:
- pure Lua: stable but slow (54.3s for 10M strings on my PC)
- strtol: fast but unstable (8.19s for 10M strings, ~20% invalid results)
- sscanf: fast, with fewer errors, but the errors are critical (core dumped, ~0.1%)
I added the files to a gist:
https://gist.github.com/realghost/1d99f6e80884831161713116dfe04d18
Please review it and test it in your environment.
Are your results the same?
Please help me find the errors in the FFI versions.
Thank you!
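For readers who want a reference point for the pure-Lua variant, here is a minimal sketch of what such a function could look like. It is not the code from the gist: the name from_hex and the nil-plus-message error convention are assumptions made for illustration, and it makes no attempt to be fast.

```lua
-- Hypothetical pure-Lua from_hex; an illustration, not the gist's code.
local function from_hex(s)
    -- Reject odd lengths and any character that is not a hex digit.
    if #s % 2 ~= 0 or s:find("%X") then
        return nil, "invalid hex string"
    end
    -- Replace every pair of hex digits with the byte it encodes.
    -- The outer parentheses drop gsub's second return value (the count).
    return (s:gsub("%x%x", function(pair)
        return string.char(tonumber(pair, 16))
    end))
end

print(from_hex("68656c6c6f"))  -- hello
```

Validating the whole input before decoding keeps the failure mode explicit: bad input yields nil and a message instead of a partially decoded string.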
I've found the error in the strtol version: I forgot to add an extra byte to the tmp buffer. Because strtol expects zero-terminated strings, it continues scanning if it finds a non-zero 3rd byte.
I also added an optimization: dst is removed, and the results are written to src.
Performance is now 6.9s for 10M strings.
Please review the current strtol version; I've added it as a PR.
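The missing terminator can be illustrated without FFI. The sketch below only emulates, in pure Lua, how a strtol-style scan consumes input. The helper scan_hex and the sample bytes are hypothetical and stand in for the real tmp buffer and whatever memory happens to follow it.

```lua
-- Emulates a strtol(buf, NULL, 16)-style scan: keep consuming hex digits
-- until a byte that is not a hex digit (for example NUL) is reached.
local function scan_hex(mem)
    local digits = mem:match("^%x*")
    return tonumber(digits, 16) or 0
end

local pair = "4a"   -- the two hex digits we actually want to decode
local stray = "7"   -- a following byte that happens to be a hex digit

-- Without a terminator the scan runs into the stray byte and parses "4a7".
local unterminated = scan_hex(pair .. stray)
-- With a NUL in the 3rd byte the scan stops after "4a".
local terminated = scan_hex(pair .. "\0" .. stray)

print(unterminated, terminated)
```

Whether the result is wrong depends on what the byte after the pair happens to contain, which would be consistent with only a fraction of the results being invalid.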
For anyone searching for a from_hex implementation: I have written one in my lua-resty-base-encoding library.
Unlike the methods listed above, I wrote the feature in C and provided a thin Lua binding.
Considering that Lua doesn't have a real buffer type and that interaction with C land is expensive, especially when the JIT is unavailable, I believe this approach could be fast.
It takes 0.93s for 10M strings (each string is 100 characters long) in my local benchmark.
Consider it resolved.