ilanschnell/bitarray

Bitshift operations endian sensitive

Closed this issue · 3 comments

It appears that bitshift operations within bitarray are endian sensitive:

>>> from bitarray.util import int2ba
>>> a = 9
>>> a << 1
18
>>> a_le = int2ba(a, 8, endian="little")
>>> a_be = int2ba(a, 8, endian="big")
>>> a_le << 1
bitarray('00100000')
>>> a_be << 1
bitarray('00010010')

While I can see the logic in this, Python itself does not behave this way. Specifically, the relevant section of the documentation states that a right shift by n bits is a division by pow(2, n) and a left shift by n bits is a multiplication by pow(2, n). That is, a 'left shift' is always performed relative to the big-endian representation.
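As a quick check of that definition, here is a minimal sketch using plain Python integers only (no bitarray involved). The values 9 and 3 are arbitrary choices for illustration:

```python
# Python's integer shifts are defined arithmetically, independent of
# how the machine stores the integer in memory.
a = 9
n = 3

left = a << n   # multiplication by pow(2, n)
right = a >> n  # floor division by pow(2, n)

assert left == a * pow(2, n)
assert right == a // pow(2, n)

# The same rule holds for negative integers (floor division rounds down).
assert -9 >> 1 == -9 // 2

print(left, right)  # 72 1
```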

I've had a quick look, but I haven't been able to track down what the C standard says about this issue (and I don't have a big-endian machine to hand for experimentation). At the very least, I feel that this behaviour should be called out in the docs, since it appears to differ from what Python itself does.

Thanks for reporting this issue! I would argue that the relevant section in the Python documentation you refer to explicitly states "These (shift) operators accept integers as arguments", and not arrays. The C behavior of shift operations is the same as in Python (also on big-endian machines), but again, these operations apply to C integers. The bitarray shift operations themselves are not endian sensitive:

>>> from bitarray import bitarray
>>> bitarray('00100000', 'big') << 1
bitarray('01000000')
>>> bitarray('00100000', 'little') << 1
bitarray('01000000')

However, when you interpret these bitarrays as integers the shift operations become endian sensitive. Not because the shift operation itself is endian sensitive, but because int2ba is endian sensitive (as it should be):

>>> int2ba(9, 8, endian="little")
bitarray('10010000')
>>> int2ba(9, 8, endian="big")
bitarray('00001001')
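To make the mechanics visible, here is a rough pure-Python model of the two steps, using bit strings in place of bitarray objects. The helper names `to_bits`, `from_bits` and `shift_left` are made up for this sketch and are not part of bitarray. They only mimic the behavior shown in the sessions above: the shift is the same in both cases, and only the conversion to and from integers depends on the endianness.

```python
def to_bits(value: int, width: int, endian: str) -> str:
    """Model of int2ba: render value as a string of width bits."""
    big = format(value, f"0{width}b")              # MSB first
    return big if endian == "big" else big[::-1]   # LSB first for little


def from_bits(bits: str, endian: str) -> int:
    """Model of ba2int: read the bit string back as an unsigned integer."""
    big = bits if endian == "big" else bits[::-1]
    return int(big, 2)


def shift_left(bits: str, n: int) -> str:
    """Model of <<: move every bit n places towards index 0, zero-fill the end."""
    n = min(n, len(bits))
    return bits[n:] + "0" * n


for endian in ("little", "big"):
    before = to_bits(9, 8, endian)
    after = shift_left(before, 1)
    print(endian, before, "->", after, "=", from_bits(after, endian))
# little 10010000 -> 00100000 = 4
# big 00001001 -> 00010010 = 18
```

The shift itself never looks at `endian`. The integer result differs only because index 0 is the least significant bit in one representation and the most significant bit in the other.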

Would you agree with this explanation?

I agree it makes sense; I actually think that 'left shift' and 'right shift' are terrible names for precisely this reason ('shift towards LSB' and 'shift towards MSB' would be unambiguous).

I would suggest tweaking the documentation to make it plain that this is how you were interpreting what << and >> mean in the context of bitarray - the former is 'shift towards a[0]' and the latter is 'shift towards a[n-1]'.
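For anyone who wants the integer-style semantics regardless of endianness, something along these lines might work. This is only a sketch: `shift_towards_msb` and `shift_towards_lsb` are hypothetical helper names, not part of bitarray, and the endianness is passed in explicitly on the assumption that the caller knows how the bitarray was created.

```python
from bitarray import bitarray
from bitarray.util import ba2int, int2ba


def shift_towards_msb(a: bitarray, n: int, endian: str) -> bitarray:
    """Hypothetical helper: multiply the integer value by 2**n (overflow dropped)."""
    # big-endian: the MSB is a[0], so bits move towards a[0] with <<
    # little-endian: the MSB is a[n-1], so bits move towards a[n-1] with >>
    return a << n if endian == "big" else a >> n


def shift_towards_lsb(a: bitarray, n: int, endian: str) -> bitarray:
    """Hypothetical helper: floor-divide the unsigned integer value by 2**n."""
    return a >> n if endian == "big" else a << n


for endian in ("big", "little"):
    a = int2ba(9, 8, endian=endian)
    doubled = shift_towards_msb(a, 1, endian)
    halved = shift_towards_lsb(a, 1, endian)
    print(endian, ba2int(doubled), ba2int(halved))
# big 18 4
# little 18 4
```

With these wrappers, both representations of 9 give 18 and 4, matching `9 << 1` and `9 >> 1` on Python integers.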

Thank you for a very useful library!

Yes, I agree that in general (when applied to integers) the terms "left shift" and "right shift" are not very good, because they always assume a big-endian notation for integers (also on little-endian machines).

Following your suggestion, I've added a note in the documentation. Thanks again!