ekpyron/xxhashct

[info] i am trying to port this library to arduino


i was looking for a light and fast hash function for arduino that is also available in php, so the two can interact

  • i checked crc16 (native in avr-libc) but it is not very 'entropic'
  • i know there are md5, sha1 and other hash libs for arduino but they are huuuge
  • another candidate i saw was the djb2 hash, but although simpler it is slow and has no native php support
  • others i saw were Paul Hsieh's SuperFastHash and Eric Ztamnl Fast-Hash, but they have larger code and no ports to other langs
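For reference, a minimal sketch of the classic djb2 loop mentioned in the list above. The function name `djb2` and the 32-bit hash width are assumptions chosen for illustration, not taken from any particular library:

```c
#include <stdint.h>

/* djb2 (attributed to Dan Bernstein): start at 5381,
   then for every byte compute hash = hash * 33 + byte. */
uint32_t djb2(const char *str)
{
    uint32_t hash = 5381;
    while (*str) {
        /* (hash << 5) + hash is hash * 33 without a multiply */
        hash = ((hash << 5) + hash) + (uint8_t)*str++;
    }
    return hash;
}
```

It processes one byte per iteration, which is part of why it is short but not especially fast on longer inputs.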

although maybe not perfect, i think this lib is the best option we have (short code, efficient, ported). i think it would be very useful for all arduino projects, i hope you don't mind if i (try to) port this lib to arduino

additional info: i (finally) uploaded it to the arduino library manager, this is the repo link

https://gitlab.com/atesin/XxHash_arduino

Won't https://github.com/Cyan4973/xxHash just run on arduino pretty much as is? Especially have a look at XXH_NO_STDLIB, resp. the comment in https://github.com/Cyan4973/xxHash/blob/b51ffce60232af69f971732dcecd2bcb3ad4179d/xxhash.h#L1703. To me it would seem that that'd be the best path towards using xxhash in an embedded environment like arduino - the implementation in this repo is meant for compile-time evaluation only and I would expect it to be significantly slower than the proper C implementation - and I'd guess that'll be especially relevant for embedded environments...

i don't know, maybe... i found the official Cyan4973 xxHash files so huge and complicated, and made for another C++ version... arduino uses a stripped down version of C++ 17 (avr-libc)

i don't know how to port it

While it is indeed not exactly small, https://github.com/Cyan4973/xxHash/blob/dev/xxhash.h is a header-only implementation in plain portable C, so I wouldn't expect much porting to be needed at all :-).

My guess would be that you can just plainly include that file and use it right away, e.g. with

#define XXH_INLINE_ALL
// the following two defines may not even be necessary, since avr-libc probably provides malloc()
#define XXH_NO_STREAM
#define XXH_NO_STDLIB
#include "xxhash.h"

you should be able to use e.g. their XXH32 and XXH64 functions without any issues (see https://github.com/Cyan4973/xxHash#build-modifiers for those options).

In any case - feel free to reuse the code in this repository, but I'd recommend giving the original implementation a try, unless you merely need compile-time hashing. For runtime hashing, you can expect https://github.com/Cyan4973/xxHash to perform much better, though.

hi Daniel, thanks for your interest and help

i followed your advice and was able to port Yann Collet's reference xxHash implementation (the xxhash.h header file) with no modifications at all. i tried the XXH32 algo and it worked very well :)

however i couldn't make XXH64 work, i suspect it is due to the XXH_INLINE_ALL modifier on the arduino platform (i think it is a luxury anyway)

i will keep you in sync about any news

update:

i was able to make XXH64 work, but as it is a bit more complicated and consumes more of arduino's valuable resources (and is not really essential) i left it behind for now

additional info: XXH64 uses the uint64_t data type internally, which is only partially supported on arduino uno (no long long typedef and no print() or printf() support, for example)... anyway, this is the function i used to fill a char array with the XXH64 hexadecimal representation, for the curious minds:

// create this output buffer beforehand, with a minimum of 17 bytes available
// (16 hex digits plus the terminating null)
char output[17];

// <input> is the pointer to the null-terminated char array holding your original data
// reentrant version that avoids memory leaks: the caller passes an external buffer, which is filled directly
// returns the same output buffer pointer
char* xxh64(char* output, const char* input)
{
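  // vsprintf() in avr-libc does not have "ll" support, so the 64-bit hash
  // is printed as two 32-bit halves (the union layout assumes little-endian, as on AVR)
  // https://www.avrfreaks.net/comment/409774#comment-409774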
  
  union
  {
    XXH64_hash_t longlong;
    uint32_t longs[2];
  }
  combined;
  
  combined.longlong = XXH64(input, strlen(input), 0);
  sprintf_P(output, (PGM_P) F("%08lX%08lX"), combined.longs[1], combined.longs[0]);
  return output;
}
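As a possible alternative to the union, here is a minimal sketch that splits the 64-bit value with shifts, so it does not depend on byte order. The helper name `hash64_to_hex` is hypothetical, and it takes an already computed hash (for example the result of XXH64) so the snippet stands on its own without xxhash.h:

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: writes the 16 uppercase hex digits of `hash`
   plus a terminating null into `out`, which must hold at least 17 bytes.
   Returns the same buffer pointer. */
char *hash64_to_hex(char *out, uint64_t hash)
{
    /* split with shifts instead of a union, so endianness does not matter */
    uint32_t high = (uint32_t)(hash >> 32);
    uint32_t low  = (uint32_t)(hash & 0xFFFFFFFFu);

    /* only the "l" length modifier is used, since "ll" is reported
       as unsupported in avr-libc in the post above */
    snprintf(out, 17, "%08lX%08lX", (unsigned long)high, (unsigned long)low);
    return out;
}
```

On arduino the format string could still be moved to flash as in the function above; it is kept in RAM here only to keep the sketch portable.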

i finished migrating the arduino xxHash lib from this header file to Yann Collet's header file... it worked very well

see issue 700, mentioned above, at the original xxHash repo

thanks for your help, support and motivation