google/stenographer

BlockFile.AllPackets() fail when using non-standard blocksize

emillynge opened this issue · 0 comments

Dear Maintainers

I have a use case where I need the AF_PACKET blocks to be larger than the default of 1Mb (Disk IO performance related)

This however seems to cause issues with queries with only time related filters. This is because base.AllPositions uses a shortcut that sidesteps the index file completely and calls BlockFile.AllPackets.

Unfortunately, allPacketsIter.Next() reads in blockData in chunks hardcoded to 1 << 20 (1Mb) and thus creates an out of bounds exception if the actual block is larger than 1Mb.

Problem is, that the tpacket_hdr_v1 does not provide any information about how large the requested block is.

//if_packet.h

	/* Number of valid bytes (including padding)
	 * blk_len <= tp_block_size
	 */
	__u32	blk_len;

That is, even if we request blocks of 1Mb from the kernel, then the blk_len can be anything from 0-1Mb and thus stenographer cannot infer the offsets between blocks from the headers.
A solution could be to not write the full 1Mb block, but only write a truncated block of size blk_len.

This would both save disk space and make it possible to handle many different block sizes. It would however increase the complexity of some of the code in blockfile.go since we cannot blindly read 1Mb from disk. But that problem more or less exists anyways if some user sets --blocksize_kb=512.

CORRECTION

In the interest of understanding te history of this issue, I will not rewrite the above. But I now realize that the sentence:

blocks of 1Mb from the kernel, then the blk_len can be anything from 0-1Mb and thus stenographer cannot infer the offsets between blocks from the headers.

is entirely incorrect. Indeed, the source code I quoted explicitly contradict this: /* Number of valid bytes (including padding)
i.e blk_len will be 1Mb even if it contains no packet at all.

As such, the proposed solution is perfectly viable, and I have provided a PR to implement it. #236