Writer: use smaller page sizes? (medium)
Closed this issue · 2 comments
gaborcsardi commented
At the very least, we need to split a column into multiple pages when it is large.
gaborcsardi commented
DuckDB writes row groups with 122,880 rows.
The Parquet spec suggests row groups of 512MB-1GB. It also suggests a page size of 8KB, which seems way too low to me.
FWIW Arrow seems to write pages of up to 1M, so that should surely be fine.
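To make the trade-off concrete, here is a minimal sketch of splitting one column chunk into pages by a byte budget. It is not the writer's actual implementation: `split_into_pages`, `value_size`, and `page_size` are hypothetical names, the values are assumed to be fixed-width, and sizes are counted before any encoding or compression.

```python
from typing import Iterator, Sequence


def split_into_pages(
    values: Sequence[int],
    value_size: int = 8,            # assumed bytes per value (e.g. a 64-bit integer)
    page_size: int = 1024 * 1024,   # assumed page budget in bytes (1 MiB)
) -> Iterator[Sequence[int]]:
    """Yield consecutive slices of `values`, each holding at most
    `page_size` bytes of raw data (but always at least one value)."""
    if value_size <= 0 or page_size <= 0:
        raise ValueError("value_size and page_size must be positive")
    # Number of whole values that fit in one page.
    per_page = max(1, page_size // value_size)
    for start in range(0, len(values), per_page):
        yield values[start:start + per_page]


# One row group of 122,880 rows of 8-byte values is 983,040 bytes of raw data.
column = range(122_880)
small_pages = list(split_into_pages(column, value_size=8, page_size=8 * 1024))
large_pages = list(split_into_pages(column, value_size=8, page_size=1024 * 1024))
print(len(small_pages), len(large_pages))
```

Under these assumptions, an 8KB budget cuts such a column chunk into 120 pages of 1,024 values each, while a 1 MiB budget keeps it in a single page.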
gaborcsardi commented
Closed by #29.