xitongsys/parquet-go

A parquet format question: reading compressed pages

Closed this issue · 0 comments

Hi @xitongsys, a general Parquet format question that you may already know the answer to:

Assuming a column is compressed and the page size is 4 kB, do we need to read all previous rows in order to decompress one page?
Basically, are all pages compressed together, or is each page compressed separately?

I tried compressing one column with various page sizes, and the final Parquet file size seems to be the same each time. This seems to suggest that all pages are compressed together.