allow uuid_str() to take any string or blob
terefang opened this issue · 8 comments
today this happens:
sqlean> select uuid_str(md5('x'));
9dd4e461-268c-8034-f5c8-564e155c67a6
sqlean> select uuid_str(sha1('x'));
sqlean> select uuid_str(sha3('x'));
sqlean> select uuid_str(sha256('x'));
sqlean> select uuid_str(sha512('x'));
sqlean>
if the string or blob is at least 16 bytes long uuid_str() could just take the first 16 bytes and ignore the rest.
in addition a shortcut could be dedicated functions like:
- uuid_str_md5(data)
- uuid_str_sha1(data)
- uuid_str_sha3(data)
- uuid_str_sha256(data)
- uuid_str_sha512(data)
uuid_str only works with valid UUIDs. And why would you want to create a UUID from the first 16 bytes of the SHA-256 hash? What's the use case here?
let me reference UUIDv3, v5 and v8:
A UUID is generated based on an unspecified name. Names are unique identifiers for an object, resource or similar within an assigned namespace. Starting from a UUID for the namespace, a UUID is generated from the name by forming a byte sequence from the namespace UUID and the name itself and then hashing this byte sequence using MD5 or SHA1. The hash is then distributed among the available UUID bits in a defined manner.
- UUIDv3 are created from the output of an MD5 hash
- UUIDv5 are created from the output of a SHA1 hash
- UUIDv8 are created from arbitrary byte sequences.
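To illustrate the three variants listed above, here is a minimal Python sketch (not sqlean code). `uuid.uuid3` and `uuid.uuid5` are standard-library functions; `uuid_v8_from_sha256` is a hypothetical helper that assumes a v8 value is built by taking the first 16 bytes of a SHA-256 digest and overwriting the version and variant bits.

```python
import hashlib
import uuid

# Assumption: any fixed namespace UUID will do; NAMESPACE_URL is just an example.
NAMESPACE = uuid.NAMESPACE_URL


def uuid_v8_from_sha256(name: str) -> uuid.UUID:
    """Hypothetical helper: UUIDv8-style value from the first 16 bytes of SHA-256(name)."""
    raw = bytearray(hashlib.sha256(name.encode("utf-8")).digest()[:16])
    raw[6] = (raw[6] & 0x0F) | 0x80  # set the version nibble to 8
    raw[8] = (raw[8] & 0x3F) | 0x80  # set the variant bits to binary 10
    return uuid.UUID(bytes=bytes(raw))


v3 = uuid.uuid3(NAMESPACE, "example")  # MD5-based
v5 = uuid.uuid5(NAMESPACE, "example")  # SHA-1-based
v8 = uuid_v8_from_sha256("example")    # custom, SHA-256-based

print(v3, v5, v8, sep="\n")
```

The same input always yields the same UUID, which is what makes these usable as stable ids.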
one could call the function more precisely:
- uuidv3_str_md5(data)
- uuidv5_str_sha1(data)
- uuidv8_str_sha3(data)
- uuidv8_str_sha256(data)
- uuidv8_str_sha512(data)
my particular real-world use case is that i create "stable unique ids" out of the concatenation of various text-fields in the row, which are then easier to join and reference and can also act as safe ids in other protocols (like rest-urls).
today i have to do this outside of sqlite, with a script that rewrites the csv-import.
it would be much simpler to do the csv-import and then issue UPDATE table SET XID=uuidv8_str_sha512(f1 || f2 || f3).
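As a rough sketch of how this use case could be covered without changes to sqlean, the example below registers a user-defined function through Python's `sqlite3` module. The function name `uuidv8_str_sha512` comes from the proposal above, but this implementation, the `items` table, and its columns are assumptions made for illustration only.

```python
import hashlib
import sqlite3
import uuid


def uuidv8_str_sha512(data):
    """Hypothetical SQL function: UUIDv8-style string from the first 16 bytes of SHA-512(data)."""
    if data is None:
        return None
    if isinstance(data, str):
        data = data.encode("utf-8")
    raw = bytearray(hashlib.sha512(data).digest()[:16])
    raw[6] = (raw[6] & 0x0F) | 0x80  # set the version nibble to 8
    raw[8] = (raw[8] & 0x3F) | 0x80  # set the variant bits to binary 10
    return str(uuid.UUID(bytes=bytes(raw)))


conn = sqlite3.connect(":memory:")
conn.create_function("uuidv8_str_sha512", 1, uuidv8_str_sha512)

# Assumed table layout standing in for the imported CSV data.
conn.execute("CREATE TABLE items (f1 TEXT, f2 TEXT, f3 TEXT, xid TEXT)")
conn.executemany(
    "INSERT INTO items (f1, f2, f3) VALUES (?, ?, ?)",
    [("a", "b", "c"), ("a", "b", "d")],
)
conn.execute("UPDATE items SET xid = uuidv8_str_sha512(f1 || f2 || f3)")

rows = conn.execute("SELECT f1, f2, f3, xid FROM items").fetchall()
for row in rows:
    print(row)
```

Note that plain concatenation can collide ('ab' || 'c' equals 'a' || 'bc'), so a separator between fields may be worth adding.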
i have looked at the code ... a quick win would be to just check if the parameter is "at least" and not "exactly" 16 bytes.
@nalgeon would you agree ?
Sorry, I don't like the idea of uuid_str truncating its argument. You can always use substr(x, 1, 16) on the hashcode and then call uuid_str on the result.
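The explicit-truncation approach can be mirrored in Python as a small sketch. It does not reproduce sqlean's implementation: `uuid_str_from_bytes` is a hypothetical stand-in that, like the behaviour described in this thread, accepts exactly 16 bytes and leaves truncation to the caller.

```python
import hashlib
import uuid


def uuid_str_from_bytes(data: bytes) -> str:
    """Hypothetical stand-in: format exactly 16 bytes as a UUID string.

    uuid.UUID raises ValueError for any other length, and the version and
    variant bits are left untouched.
    """
    return str(uuid.UUID(bytes=data))


md5_digest = hashlib.md5(b"x").digest()        # 16 bytes, usable as-is
sha256_digest = hashlib.sha256(b"x").digest()  # 32 bytes, too long as-is

print(uuid_str_from_bytes(md5_digest))
print(uuid_str_from_bytes(sha256_digest[:16]))  # the caller truncates explicitly
```

The first line printed matches the md5 result shown at the top of this issue.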
would you accept a pull request for uuid_str_<HASHALGO>?
No, I don't think so, sorry.
i don't understand
- i have presented a use-case
- i have given a standard reference
i was only interested in making other users' lives easier as well.
sqlean is a really useful contribution to the sqlite community and i hope you keep up the existing work.