nomic-ai/deepscatter

Throw error on buffers of length > 16m

bmschmidt opened this issue · 0 comments

We don't define a limit on tile size, but a hardcoded parameter in MultipurposeBufferSet assumes that no single column will be larger than 64MB. That means a tile with more than 16 million points, or an individual Arrow record batch larger than that, will completely break deepscatter. It's unlikely anyone would do that, but it should throw an error, and I don't think it currently does.

Note that the issue here is about individual tiles. Having more than 16 million points in a dataset is completely fine.
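
A rough sketch of the kind of check this is asking for, not the actual deepscatter code. The names `ASSUMED_BUFFER_BYTES`, `maxRowsPerBuffer`, and `assertFitsInBuffer` are hypothetical, and the 4 bytes per value is an assumption (float32 columns), which is what makes 64MB come out to roughly 16 million values.

```typescript
// Hypothetical constant: the issue describes a 64MB per-column assumption,
// but the real parameter name and location in MultipurposeBufferSet may differ.
const ASSUMED_BUFFER_BYTES = 64 * 1024 * 1024;

// Assumption: each value is stored as a 4-byte float32.
const BYTES_PER_VALUE = 4;

// Number of values that fit in one buffer under the assumptions above.
function maxRowsPerBuffer(
  bufferBytes: number = ASSUMED_BUFFER_BYTES,
  bytesPerValue: number = BYTES_PER_VALUE
): number {
  return Math.floor(bufferBytes / bytesPerValue);
}

// Throw a descriptive error instead of failing silently when a single
// tile or record batch has more rows than one buffer can hold.
function assertFitsInBuffer(rowCount: number, label: string = "tile"): void {
  const limit = maxRowsPerBuffer();
  if (rowCount > limit) {
    throw new RangeError(
      `${label} has ${rowCount} rows, which exceeds the per-buffer limit of ` +
        `${limit}; split it into smaller tiles or record batches.`
    );
  }
}
```

Something like `assertFitsInBuffer(batch.numRows, "record batch")` could run when a tile or batch is loaded, so the failure shows up at the source rather than later in rendering. Where exactly that call belongs is left open here.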