dmlc/rabit

Reducing objects whose size is not known at compile time

Closed this issue · 1 comment

Right now the interface of rabit AllReduce requires that the items being reduced are POD types whose byte size is known at compile time.

I think the reason for that is that internally it uses MPI_ALLREDUCE, which also requires POD data types and a count of items so it can calculate the memory cost as sizeof(DATA_TYPE) * count.
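For reference, a minimal sketch of how that POD-based interface gets used (paraphrased from the rabit guide; the header path and exact names are from memory, so treat details as assumptions):

#include <rabit/rabit.h>
#include <vector>

int main(int argc, char *argv[]) {
  rabit::Init(argc, argv);
  // Each worker holds a buffer of POD elements; the total byte size
  // is simply sizeof(float) * data.size(), known before the call.
  std::vector<float> data(16, 1.0f);
  // In-place sum across all workers; the element type and count fully
  // determine how much memory the reduction needs.
  rabit::Allreduce<rabit::op::Sum>(&data[0], data.size());
  rabit::Finalize();
  return 0;
}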

My question is this: are there any constraints that prevent performing reductions on objects whose size is only known at runtime? I'm pretty sure it's not possible with the current codebase; I'm just wondering if I'm missing something that would make this impossible on the MPI side (I know, for example, that there are no sparse collectives in MPI).

This mainly has to do with the protocol: many allreduce algorithms need to divide up the work, which means that the size of the objects (or at least the maximum size of each object) needs to be known ahead of time.
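To illustrate why, here is a generic ring-allreduce sketch (not rabit's actual internals; all names here are made up). Each worker is assigned a fixed byte range of the buffer, and those offsets are computed before any data moves, which is only possible when the total size is known up front:

#include <cstddef>
#include <cstdio>

// Illustrative only: the byte range each worker reduces is fixed
// before communication starts, so nbytes must be known in advance.
static size_t chunk_begin(size_t nbytes, size_t nworkers, size_t rank) {
  return nbytes * rank / nworkers;
}

int main() {
  const size_t nbytes = 1024, nworkers = 4;
  for (size_t r = 0; r < nworkers; ++r) {
    std::printf("worker %zu reduces bytes [%zu, %zu)\n", r,
                chunk_begin(nbytes, nworkers, r),
                chunk_begin(nbytes, nworkers, r + 1));
  }
  return 0;
}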

We do support things like

class SerializeReducer {

which supports reduction of objects (assuming we know the maximum number of bytes needed for serialization).
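The shape of that interface is roughly as follows (paraphrased from memory of the rabit headers; exact parameters may differ, and the call site below is hypothetical):

// Paraphrased shape, not a verbatim copy of the header:
template<typename DType>
class SerializeReducer {
 public:
  // Reduce `count` objects in place across all workers. Each object is
  // serialized into a slot of max_nbyte bytes, so the maximum serialized
  // size must be supplied up front even though each object's actual
  // size is only known at runtime.
  void Allreduce(DType *sendrecvobj, size_t max_nbyte, size_t count);
};

// Hypothetical call site:
//   SerializeReducer<MySparseVector> reducer;
//   reducer.Allreduce(&objs[0], kMaxBytes, objs.size());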

One danger is that the size of an object can blow up during reduction (after allreduce is called), and this prevents things like pre-allocating buffers statically.