varargs for hashing?
tantaman opened this issue · 6 comments
I noticed that the hash functions only accept 1 argument
Lines 200 to 204 in ab0e7e2
but the underlying implementations do appear to be able to update
an existing hash with more data.
E.g., from sha1.c
Lines 202 to 221 in ab0e7e2
Could the extension be updated to accept varargs for hash code generation? E.g., for the use case of hashing an entire selection set?
cc @schickling
What do you mean by "use case of hashing an entire selection set"? Please provide an example.
SELECT sha1(*) FROM foo;
where sha1
here would return the hash code of all the cells taken together for each row.
E.g.,
.mode column
CREATE TABLE foo (a, b, c);
INSERT INTO foo VALUES (1,2,3), (4,5,6), (7,8,9);
SELECT sha1(*) as hash FROM foo;
hash
----
row1_hash
row2_hash
row3_hash
Thanks!
That's not exactly how SQL works. To apply a function (other than count
) to all columns in a row, you must list them explicitly:
select sha1(a, b, c) as hash from foo;
Could you please clarify why it would be helpful to compute a single hash over multiple columns? Do you have a real-world use case?
Do you have a real-world use case?
@schickling does
He wants to compare whether or not new query results differs from his previous query results. Other ideas would be:
- compare each item in the results, cell by cell
- add counter columns to each table that are incremented on any mutation of a row. This of course get problematic when joining many tables since counters need to be selected and concatenated.
- any other ideas?
I'm not sure I follow. Let's say sha1
can compute the hash over multiple columns (sha1(a, b, c)
). What do I do with that? What exactly does it mean "to compare whether or not new query results differs from his previous query results"? What are the "query results" and "previous query results" here? I'd appreciate a complete example.
All right, then 🤷♀️