Increasing performance for special cases?
Hello,
I am using cl-conspack via cl-mpi in a distributed-memory parallel environment, and I observe that it accounts for a significant share of my communication overhead. I can tailor my communications so that I transmit only vectors with the special element types 'double-float and (unsigned-byte 32); however, I need both of them. Would it be possible to speed up conspack encoding/decoding for these special kinds of vectors? Or is this an impossible request because it would become too implementation-dependent?
Thank you,
Nicolas
Speedups in conspack (and fast-io) are certainly possible, mostly revolving around cases like this with large homogeneous vectors. The main killer right now is that it still has to encode element by element. Fixing this would likely require a couple of things.
- Implementing endian switching, because I made the horrible mistake of using network byte order to start with. This is probably not really hard—two additional codes—and it should be mostly-transparent to use.
- Direct dumps to and from arrays, probably using static-vectors, to speed up copying. This will probably require slightly more intervention from the end user, but in cases such as yours it would be highly beneficial, and the static vectors are possibly already in place.
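To make the two points above concrete, here is a minimal, language-neutral sketch in Python rather than Common Lisp. It is not conspack's wire format: the one-byte endianness tag, the header layout, and the function names are all hypothetical and exist only for illustration. It contrasts packing each double separately in network byte order with dumping the whole array in native order and swapping bytes on the receiving side only when the tag says the orders differ.

```python
import struct
import sys
from array import array

# Hypothetical one-byte endianness tags; these are not conspack codes.
TAG_BIG, TAG_LITTLE = 0, 1
NATIVE_TAG = TAG_LITTLE if sys.byteorder == "little" else TAG_BIG
HEADER = ">BI"  # tag byte + 32-bit element count, no padding
HEADER_SIZE = struct.calcsize(HEADER)


def encode_elementwise(values):
    """Pack each double separately in network (big-endian) byte order."""
    out = bytearray(struct.pack(">I", len(values)))
    for v in values:
        out += struct.pack(">d", v)
    return bytes(out)


def encode_bulk(values):
    """Dump the whole array in native order, prefixed by an endianness tag."""
    header = struct.pack(HEADER, NATIVE_TAG, len(values))
    return header + values.tobytes()


def decode_bulk(data):
    """Rebuild the array, swapping bytes only if the sender's order differs."""
    tag, count = struct.unpack_from(HEADER, data, 0)
    result = array("d")
    end = HEADER_SIZE + result.itemsize * count
    result.frombytes(data[HEADER_SIZE:end])
    if tag != NATIVE_TAG:
        result.byteswap()
    return result


if __name__ == "__main__":
    vec = array("d", [1.5, -2.0, 3.25])
    assert decode_bulk(encode_bulk(vec)) == vec
```

When sender and receiver share a byte order, which is the usual situation on a homogeneous cluster, the bulk path reduces to a single copy in each direction. The same idea would apply to (unsigned-byte 32) vectors with a different element width.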
I probably don't have the time to implement this myself, but I'd be available to point out where things should likely go should someone want to work on it. Overall it's a fair but probably not large amount of work.