thunder-project/thunder

Series binary read/write round-trip fails

jwittenbach opened this issue · 0 comments

Writing Series data to disk with series.tobinary() and then reading it back with series.frombinary() can return the records in a jumbled order when the data spans multiple partitions in Spark mode.

The problem is that the writer reverses the labels on the written files (each label corresponds to the index of the first record contained in that file).
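
To illustrate why reversed labels would jumble a round trip, here is a toy model in plain Python. It does not use Thunder's code or file format: `write_partitions`, `read_partitions`, and the `reverse_labels` flag are hypothetical names, and the sketch assumes that "reversed" means the labels are assigned to the files in the opposite order and that the reader concatenates files in ascending label order.

```python
def write_partitions(records, n_partitions, reverse_labels=False):
    """Toy writer: split records into chunks, one 'file' per partition.

    Each file is labelled with the index of its first record.
    reverse_labels=True mimics the suspected bug (an assumption about
    its form, not Thunder's actual code).
    """
    size = -(-len(records) // n_partitions)  # ceiling division
    chunks = [records[i:i + size] for i in range(0, len(records), size)]
    labels = [i * size for i in range(len(chunks))]
    if reverse_labels:
        labels = labels[::-1]  # labels no longer match file contents
    return dict(zip(labels, chunks))


def read_partitions(files):
    """Toy reader: concatenate the files in ascending label order."""
    out = []
    for label in sorted(files):
        out.extend(files[label])
    return out


records = list(range(8))
good = read_partitions(write_partitions(records, 4))
bad = read_partitions(write_partitions(records, 4, reverse_labels=True))

print(good)  # [0, 1, 2, 3, 4, 5, 6, 7]
print(bad)   # [6, 7, 4, 5, 2, 3, 0, 1]
```

In this model a single partition round-trips correctly even with the flag set, which is consistent with the failure only showing up when multiple partitions are written.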