bug in test if hashable
Closed this issue · 5 comments
In this line, you test whether the args are hashable: https://github.com/brmscheiner/memorize.py/blob/master/memorize/memorize.py#L80
but since args is a tuple there, it will always be hashable, even if it contains unhashable types (for example if the functions takes a single list as argument, args might look like ([1],) , which is an instance of collections.Hashable but turns out not to be hashable.
Proposed solution: remove these lines altogether and put this in a try/except TypeError block. Or if this is intended behaviour, please tell me what I've misunderstood.
(I got here through https://wiki.python.org/moin/PythonDecoratorLibrary, if anyone who reads this wants to fix the various versions there too, please do)
Thanks @JoranDox, nice catch! I agree, but I don't understand the proposed solution. Could you submit a PR and show me?
Here you go: #4
Like I mention in the pull request (and how we solved our version), another option would be to use str(args) as key if it's not hashable. But with the risk of two different objects having the same string representation, or even one object having multiple string representations (looking at you, pandas dataframes), I wouldn't make that the default. Maybe based on an argument like @memorize(string_if_unhashable=False)
@JoranDox I liked the PR and approved, but I don't like the idea of comparing str(args)
. I know it's probably the easiest way to compare non-hashable arguments, but like you mentioned there are many places things could go wrong.
I am more inclined to look into lean libraries for deep comparison like you mentioned, or to allow users to provide a function that maps non-hashable arguments to some sort of hashable key.
Usage could be something like
def getHash(arg1, arg2):
return str([arg1, arg2])
@Memorize(getHash)
def myFunction(arg1, arg2):
....
Thoughts?
Good point, making it an argument and keeping python's hash
as default would preserve the current case but make it easily extendable to non-hashable types. And this made me google something and lo and behold: pandas.util.hash_pandas_object() exists!
Oh that's tight! Alright I'm gonna create an issue and hopefully take a stab at this soon (I mostly write javascript these days)