Matching C order for bit packed structs.
import bitstring
from bitstring import Bits,pack
from collections import namedtuple
bitstring.options.lsb0 = True
types = "=B,bool,pad7"
data = pack(types,0xAA,True) #pack the bytes
for i in data.bytes:  # print out each byte
    print(hex(i))
python output:
0x1 0xaa
#include <stdio.h>
typedef struct {
    unsigned char Q:8;
    _Bool A:1;
    int :7;
} stuff_t;
int main() {
    stuff_t B = {.Q=0xAA,.A=1}; // pack the bytes
    unsigned char *D = (unsigned char*)(void*)&B; // view the struct as raw bytes
    for (int i = 0; i < 2; i++) { // print out each byte
        printf("0x%02x ", D[i]);
    }
    return 0;
}
c results:
0xaa 0x01
The problem here is that the bytes are reversed from what I would expect.
It seems like fixing this would be as easy as reversing the bits, but it isn't: doing so messes with the endianness of the stored data.
An uneasy hack that seems to make these match is:
bitstring.options.lsb0 = True
reverse the endianness on all dtypes.
unpacking:
- reverse incoming data. data[::-1]
- unpack
packing:
- pack input data
- reverse byte order before sending. data[::-1]
I'm hoping there is a better way.
This doesn't really work either, because it also reverses strings.
Hi.
I think that pack is working as intended here. The LSB0 mode means that reads, slices etc. are happening from right to left, so for example the reverse operation
>>> data.unpack(types)
[170, True]
is what you'd expect. When the data is unpacked here it does the right-most byte first, then the next bit. This is correct and I think it's reasonable that pack should behave in the same way.
I think there is some confusion because for this particular case reversing the byte order exactly reverses the effect of LSB0 (as both things are contained in one byte). Reversing byte order won't work in general.
Your C code is implicitly MSB0, so possibly you don't need LSB0 mode at all?
> Your C code is implicitly MSB0
Hmm, I'm wondering if that's correct because when I do the following:
#include <stdio.h>
#include <stdint.h>
typedef struct {
    _Bool A:1;
    _Bool B:1;
    _Bool C:1;
    _Bool D:1;
    int :4;
} myTest_t;
int main()
{
    myTest_t DATA = {.A=1,.B=1,.C=1,.D=1};
    uint8_t Printer = *(uint8_t*)(void*)&DATA;
    printf("0x%02x\n",Printer);
    return 0;
}
you can check it for yourself here: https://onlinegdb.com/G4g3AkgBc
With MSb0 I would expect:
0xF0
A B C D pad
With LSb0 I would expect:
0x0F
pad D C B A
It might be that I'm misinterpreting something, but this is how I'm used to C compilers behaving.
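To make the two expectations above concrete, here is a minimal pure-Python sketch of both bit numberings for a single byte. The helper name `pack_flags` is hypothetical and is not part of bitstring or of any compiler's behaviour; it only illustrates where the first flag lands under each convention.

```python
def pack_flags(flags, width=8, lsb0=True):
    """Pack booleans into one `width`-bit integer (illustrative helper).

    lsb0=True  puts the first flag in bit 0 (least significant bit).
    lsb0=False puts the first flag in the most significant bit.
    Unused bits are left as zero padding.
    """
    value = 0
    for index, flag in enumerate(flags):
        shift = index if lsb0 else width - 1 - index
        value |= int(bool(flag)) << shift
    return value


flags = [True, True, True, True]  # A, B, C, D; the other 4 bits are padding
print(f"MSb0: 0x{pack_flags(flags, lsb0=False):02X}")  # MSb0: 0xF0
print(f"LSb0: 0x{pack_flags(flags, lsb0=True):02X}")   # LSb0: 0x0F
```

The C program above printing 0x0f is therefore consistent with the LSb0 row.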
OK, I think you're right and hopefully I get it now. The C compiler is packing the bitfields in an LSB0 manner (this is, I think, compiler dependent), but only the bitfields. The larger byte structures are still being packed left-to-right.
So in the original struct
typedef struct {
    unsigned char Q:8;
    _Bool A:1;
    int :7;
} stuff_t;
the char comes first, then the bitfields are packed LSB0 so it's the int followed by the _Bool. So reading left-to-right it's the first member of the struct, then the third, then the second. That's how you get the final 0xaa 0x01 output.
In bitstring with LSB0 the equivalent does the 3rd then 2nd then 1st field (reading left-to-right). For MSB0 it does it the opposite way round. So neither will match the C code.
I don't think there's anything that bitstring can do here: the packing rules of a particular C compiler are a bit out of scope. I think I would keep it in MSB0, and then try to reverse the C-packed bitfields for whichever rules the compiler has. In general it's a mix of bit and byte endianness!
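As a rough illustration of handling the compiler's rules outside bitstring, the sketch below packs fields by hand with integer arithmetic. It assumes one specific layout: fields are allocated starting from the least significant bit, in declaration order, and the result is stored little-endian, which is what the two C outputs in this thread suggest for that compiler. The function names are hypothetical, and alignment padding and fields that straddle storage units are not modelled.

```python
def pack_c_bitfields(fields, total_bytes):
    """Pack (value, bit_width) pairs, first field in the lowest bits.

    Assumption: LSB-first allocation with little-endian byte storage.
    """
    word = 0
    offset = 0
    for value, width in fields:
        mask = (1 << width) - 1
        word |= (value & mask) << offset
        offset += width
    if offset > total_bytes * 8:
        raise ValueError("fields do not fit in total_bytes")
    return word.to_bytes(total_bytes, "little")


def unpack_c_bitfields(data, widths):
    """Inverse of pack_c_bitfields: return one integer per bit width."""
    word = int.from_bytes(data, "little")
    values = []
    offset = 0
    for width in widths:
        values.append((word >> offset) & ((1 << width) - 1))
        offset += width
    return values


# stuff_t from this thread: Q:8, A:1, then 7 bits of padding
packed = pack_c_bitfields([(0xAA, 8), (1, 1), (0, 7)], total_bytes=2)
print(packed.hex(" "))                        # aa 01
print(unpack_c_bitfields(packed, [8, 1, 7]))  # [170, 1, 0]
```

A different compiler or a big-endian target may allocate bits differently, so the layout should be checked against real output from the C side before relying on this.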
It's somewhat interesting that this hasn't shown up more often, given that Python's own ctypes assumes the same order for little-endian chips.
I actually put together something to pack and unpack these using ctypes:
It's actually a Python file, but it gives an example of how the packing is assumed in CPython. It matches gcc, armcc, and ctypes for x86.
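The file mentioned above is not reproduced in this thread, so the following is only a minimal sketch of the ctypes approach, not the author's code. It mirrors stuff_t with ctypes bit fields; the names `Stuff`, `pack_stuff` and `unpack_stuff` are made up for illustration, and `c_uint8` is used for every field as an assumption so the structure stays two bytes wide.

```python
import ctypes


class Stuff(ctypes.Structure):
    """ctypes mirror of stuff_t; the bit layout follows the host platform."""
    _fields_ = [
        ("Q", ctypes.c_uint8, 8),     # unsigned char Q:8
        ("A", ctypes.c_uint8, 1),     # _Bool A:1
        ("_pad", ctypes.c_uint8, 7),  # unnamed 7-bit padding in the C struct
    ]


def pack_stuff(q, a):
    """Return the raw bytes of a Stuff holding the given values."""
    return bytes(Stuff(Q=q, A=int(a)))


def unpack_stuff(data):
    """Read Q and A back out of raw bytes."""
    stuff = Stuff.from_buffer_copy(data)
    return stuff.Q, bool(stuff.A)


raw = pack_stuff(0xAA, True)
print(raw.hex(" "))       # aa 01 on a little-endian host such as x86
print(unpack_stuff(raw))  # (170, True)
```

Because ctypes follows the host's native layout, the byte values can differ on a big-endian machine, which is the same compiler and platform dependence discussed earlier in the thread.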