HUH: ``squeeze``ing scalars
I wasn't sure if this should be an "ENH", "BUG", "TYP", "DEP", or "MAINT", so I opted for the sound that came out of my mouth when I first encountered this. Feel free to change it to one of those if you feel like it.
This just made me "huh" out loud:
>>> np.squeeze(1)
array(1)
>>> np.squeeze(np.array(1))
array(1)
>>> np.squeeze(np.int_(1))
np.int64(1)
So "scalar-likes" are not treated equally; some return numpy scalars, others return 0d arrays.
Besides this increasing the ever so important "huh/sec" rate, it's also pretty annoying to express in the stubs. But before you panic and/or call the press, the squeeze stubs aren't incorrect, so it's not that big of a disaster.
Anyway, given the array-api's lack of scalar-types, and them being infuriatingly annoying for static typing, I think that I'm slightly leaning towards choosing 0d-arrays over scalars here. I.e., changing squeeze so that it always returns an instance of numpy.ndarray (or a subtype thereof), even if you pass it a np.generic scalar-type thingy.
This would technically be a backwards-incompatible breaking change. But, considering that 0d arrays are mostly duck-type compatible, and that I don't see why anyone would want to squeeze their scalars in the first place, I doubt that many will be bothered by this change.
TLDR; Let's have numpy.squeeze always return an array
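To make the "mostly duck-type compatible" claim a bit more concrete, here is a minimal sketch comparing a NumPy scalar with its 0d-array counterpart. The variable names are made up for illustration, and the checks are only a small sample of behavior, not an exhaustive compatibility test.

```python
import numpy as np

scalar = np.int_(1)             # a NumPy scalar (np.generic instance)
zero_d = np.asanyarray(scalar)  # the equivalent 0d array

# Both expose an empty shape and the same dtype.
same_shape = scalar.shape == zero_d.shape == ()
same_dtype = scalar.dtype == zero_d.dtype

# Arithmetic, conversion to a Python float, and .item() agree.
same_sum = bool(scalar + 1 == zero_d + 1)
same_float = float(scalar) == float(zero_d)
same_item = scalar.item() == zero_d.item()

# One place where they differ: scalars are hashable, arrays are not.
try:
    hash(zero_d)
    array_is_hashable = True
except TypeError:
    array_is_hashable = False

print(same_shape, same_dtype, same_sum, same_float, same_item, array_is_hashable)
```

Hashability is the kind of difference that could bite code using squeezed scalars as dict keys or set members, which is presumably rare.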
squeeze is one of those annoying functions that forwards to the method via obj.squeeze and, if that fails, converts the input to an array.
So the problem with changing this isn't changing scalars, it's that np.squeeze(dataframe) might work by calling dataframe.squeeze()...
But there's no builtins.int.squeeze() method, so why does it become an array?
Either way, this could then be "fixed" by changing np.generic.squeeze() to return a 0d-array, no?
Hello! Can I work on this issue?
We don't assign issues. Just be sure to link back to this issue in the PR so others will know about the PR. I think it would be prudent to wait a little more before diving in, to make sure the suggested fix will be acceptable.
> to return a 0d-array, no?
I suppose we can do that, yes. NumPy tries to use the method and, if it doesn't exist, converts to an array first.
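A small sketch of that forwarding behavior, for anyone following along. `Squeezable` is a hypothetical stand-in for a third-party container such as a dataframe; it is not a real NumPy or pandas class.

```python
import numpy as np


class Squeezable:
    """Hypothetical container that defines its own squeeze() method."""

    def __init__(self):
        self.calls = []  # records the axis argument of every call

    def squeeze(self, axis=None):
        self.calls.append(axis)
        return "squeezed by the object itself"


obj = Squeezable()

# np.squeeze finds obj.squeeze and simply forwards to it.
forwarded = np.squeeze(obj)
forwarded_axis = np.squeeze(obj, axis=0)

# A plain Python int has no .squeeze attribute, so NumPy converts it
# to an array first and squeezes that instead.
fallback = np.squeeze(1)

print(forwarded, obj.calls, type(fallback).__name__)
```

This is why changing what `np.squeeze` returns for arbitrary inputs is awkward: for objects with their own method, the return type is whatever that method decides.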
@jorenham I was looking into the implementation of np.squeeze, and here’s how I thought we could address this issue:
  def squeeze(a, axis=None):
      """
      Docstring
      """
+     if isinstance(a, np.generic):
+         a = np.asanyarray(a)
      try:
          squeeze = a.squeeze
      except AttributeError:
          return _wrapit(a, 'squeeze', axis=axis)
      if axis is None:
          return squeeze()
      else:
          return squeeze(axis=axis)
Is this the correct way to approach this problem, or do you suggest looking even deeper?
Assuming that we decide to indeed have squeeze always return ndarray, then something like this would probably be the way to implement it, yes. There might be slightly more efficient ways to convert a scalar to a 0d array, but I might be wrong.
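For experimenting with the proposed behavior without patching NumPy, the same idea can be sketched as a standalone wrapper. `squeeze_always_array` is a hypothetical name, not an existing NumPy function, and this only covers `np.generic` inputs; objects with their own `squeeze` method are still forwarded as before.

```python
import numpy as np


def squeeze_always_array(a, axis=None):
    """Hypothetical wrapper: like np.squeeze, but NumPy scalars are
    converted to 0d arrays before squeezing."""
    if isinstance(a, np.generic):
        a = np.asanyarray(a)
    return np.squeeze(a, axis=axis)


# Try it on the inputs from the original report, plus a float scalar.
for value in (1, np.array(1), np.int_(1), np.float64(2.5)):
    result = squeeze_always_array(value)
    print(type(value).__name__, "->", type(result).__name__, result.shape)
```

Regular arrays pass through unchanged, so `squeeze_always_array(np.ones((1, 3, 1)))` still drops the length-one axes as usual.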