Incorrect result when packing and unpacking a recarray with padding bytes
fyrestone commented
The unpacked array should match the original. Instead, its dtype comes back with an unexpected extra field.
import blosc2
import numpy as np
print(blosc2.__version__)
print(np.__version__)
dtype = {
    "names": ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l"],
    "formats": [
        "<u8",
        "<i8",
        "<i8",
        "<u8",
        "<i4",
        "<u4",
        "<u4",
        "<i2",
        "i1",
        "i1",
        "i1",
        "<u8",
    ],
    "offsets": [0, 8, 16, 24, 32, 36, 40, 44, 46, 47, 48, 56],
    "itemsize": 64,
    "aligned": True,
}
arr = np.recarray(100, dtype=dtype)
print(type(arr), arr.dtype)
arr2 = blosc2.unpack_tensor(blosc2.pack_tensor(arr))
print(type(arr2), arr2.dtype)
Output
3.0.0b4
1.26.4
<class 'numpy.recarray'> (numpy.record, {'names': ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l'], 'formats': ['<u8', '<i8', '<i8', '<u8', '<i4', '<u4', '<u4', '<i2', 'i1', 'i1', 'i1', '<u8'], 'offsets': [0, 8, 16, 24, 32, 36, 40, 44, 46, 47, 48, 56], 'itemsize': 64, 'aligned': True})
<class 'numpy.ndarray'> [('a', '<u8'), ('b', '<i8'), ('c', '<i8'), ('d', '<u8'), ('e', '<i4'), ('f', '<u4'), ('g', '<u4'), ('h', '<i2'), ('i', 'i1'), ('j', 'i1'), ('k', 'i1'), ('f11', 'V7'), ('l', '<u8')]
An additional field, f11, was added to the unpacked dtype. What is it?
FrancescAlted commented
Yes, I can reproduce this. If you can find the root of the issue, shout!
fyrestone commented
I think this is a padding bug. If I remove the offsets, itemsize, and aligned parameters, the output is correct:
import blosc2
import numpy as np
print(blosc2.__version__)
print(np.__version__)
dtype = {
    "names": ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l"],
    "formats": [
        "<u8",
        "<i8",
        "<i8",
        "<u8",
        "<i4",
        "<u4",
        "<u4",
        "<i2",
        "i1",
        "i1",
        "i1",
        "<u8",
    ],
    # "offsets": [0, 8, 16, 24, 32, 36, 40, 44, 46, 47, 48, 56],
    # "itemsize": 64,
    # "aligned": True,
}
arr = np.recarray(100, dtype=dtype)
print(type(arr), arr.dtype)
arr2 = blosc2.unpack_tensor(blosc2.pack_tensor(arr))
print(type(arr2), arr2.dtype)
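When the padded layout comes from elsewhere and the parameters cannot simply be dropped, a possible workaround along the same lines is to repack the fields before packing. This is only a sketch, not a confirmed fix: it uses NumPy's `numpy.lib.recfunctions.repack_fields` to build a dtype without padding holes, and it assumes the receiving side does not need the original 64-byte layout. The variable names are illustrative.

```python
import numpy as np
from numpy.lib import recfunctions as rfn

# Same padded layout as in the report (7 padding bytes between "k" and "l").
padded = np.dtype({
    "names": ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l"],
    "formats": ["<u8", "<i8", "<i8", "<u8", "<i4", "<u4", "<u4",
                "<i2", "i1", "i1", "i1", "<u8"],
    "offsets": [0, 8, 16, 24, 32, 36, 40, 44, 46, 47, 48, 56],
    "itemsize": 64,
    "aligned": True,
})

# repack_fields lays the same fields out back to back, with no holes.
packed = rfn.repack_fields(padded)
print(padded.itemsize, packed.itemsize)

# It also accepts an array and converts it to the packed layout.
arr = np.zeros(100, dtype=padded)
arr_packed = rfn.repack_fields(arr)
print(arr_packed.dtype.names == arr.dtype.names)
```

The trade-off is that the in-memory layout changes (the record shrinks and fields are no longer aligned), so this only helps when the exact byte layout does not matter.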
fyrestone commented
The f11 field is the padding hole, shown as ? in the layout below. I don't know why it becomes a column.
0        8        16       24       32       40       48       56       64
|--------|--------|--------|--------|--------|--------|--------|--------|
|aaaaaaaa|bbbbbbbb|cccccccc|dddddddd|eeeeffff|gggghhij|k???????|llllllll|
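The same extra field can be reproduced with NumPy alone, which may help narrow down the cause. The sketch below assumes (this is not confirmed anywhere in the thread) that the dtype gets rebuilt at some point from its `descr` list. NumPy's `dtype.descr` reports a padding hole as an unnamed void entry, and `np.dtype` assigns unnamed entries a default name of the form `f<index>`.

```python
import numpy as np

# Same padded layout as in the report.
dt = np.dtype({
    "names": ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l"],
    "formats": ["<u8", "<i8", "<i8", "<u8", "<i4", "<u4", "<u4",
                "<i2", "i1", "i1", "i1", "<u8"],
    "offsets": [0, 8, 16, 24, 32, 36, 40, 44, 46, 47, 48, 56],
    "itemsize": 64,
    "aligned": True,
})

# descr lists the 7-byte hole after "k" as an unnamed void entry.
descr = dt.descr
print(descr[11])

# Rebuilding a dtype from descr turns that entry into a real field,
# named after its position in the list.
rebuilt = np.dtype(descr)
print(rebuilt.names)
print(dt.itemsize, rebuilt.itemsize)
```

If this is what happens, the byte layout itself survives the round trip (both dtypes are 64 bytes wide), and only the field list gains the padding entry. Carrying the names, formats, offsets, and itemsize explicitly instead of `descr` would avoid that.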