nstepien/iltorb

Use BufferList instead of managing buffers within the code

Closed this issue ยท 8 comments

Hey,

Amazing work on the Brotli bindings for node.js, I just peeked into the code and saw that for the decompress and compress the collection of the buffers is being managed manually, since those are just wrappers around the streams. Could it be possible to include the Buffer List module. It implements the Buffer Readable Interface and saves from performing a big concat on the collection of buffers, leaving this functionality to the consumer. What are your thoughts about this? Thanks again ๐Ÿ‘

We already offer compression and decompression streaming, I'm not sure what we would gain from using Buffer List.

Can we gain something by implementing it not into the streaming methods but the other ones where we are preforming Buffer.concat instead of doing that we can returned the appended BufferList output, it would save us the concatenation operation. I don't have knowledge about whether Buffer.concat is fast or not but BufferList can improve the performance maybe, although there might be an overhead of the bufferlist object itself too.

Here is my input on the matter.

  1. Buffer.concat can cause a loss in performance in some cases, but we are performing this concatenation only once with an array of Buffers and providing it with the length so that it doesn't need to perform any loops to determine the total length. We are also not calling Buffer.concat multiple times as we generate an output. I don't see this as a huge impact in performance and we do have the stream API available as an alternative .

  2. bl seems to do a lot of things that isn't really necessary for what we need. I don't see any functional benefits here. It's still calling Buffer.concat when performing any type of slicing logic. The only difference would be that we would have an additional dependency that isn't entirely necessary.

  3. Changing this would result in a major breaking change in this library and that's not something I'm keen on doing. Currently, we are returning a single Buffer like the core zlib module and node-zopfli library.

In addition to what @oohnoitz said,

  1. We're trying to provide low level APIs so you can build anything on top of it, this is why we mimick the core zlib/no-zopfli APIs. You're free to use Buffer List on top of the streaming APIs if you so desire.
  2. I'd rather have as few dependencies as possible.
  3. In the future the bindings might change, and we could end up getting the entire buffer straight from brotli itself for the non-streaming APIs, so bl would be useless here.

Thanks for explaining the reasons @oohnoitz and @MayhemYDG . It does seem Buffer is the way to go here since it does not introduce any library and it might bloat the code for iltorb. Below, is merely my understanding of what I learned from the comments above ๐Ÿ‘

Depending on the use case BufferList might indeed cost higher as a concat would be needed to perform for each operation. My suspicion for concat arose from my limited knowledge of the node internals, but as @oohnoitz suggested a bl is an unnecessary dependency, and I do believe now it's better to do it once than again and again like for BufferList.

Will it break API though if BufferList is implemented since it implements the Buffer Readable, but since it's a third party module, Node API can get updated but BufferList might not so quickly, so that's one more plus for implementing Buffer.concat?

@MayhemYDG Does Brotli API not allow currently to get the complete buffer in one go, is streaming required, wasn't it the way it was done before they did the streaming, I also have a very little understanding of C++.

https://github.com/google/brotli/blob/5b4769990dc14a2bd466d2599c946c5652cba4b2/c/include/brotli/encode.h#L243-L270
Yes you can "get the complete buffer in one go", but it requires guessing the size of the output buffer, which is less than ideal.

Will it break API though if BufferList is implemented since it implements the Buffer Readable, but since it's a third party module, Node API can get updated but BufferList might not so quickly, so that's one more plus for implementing Buffer.concat?

Yes, it will break the API as Buffer List will return a bl instance and not a Buffer instance. You can also access the standard Buffer readable APIs available in the Buffer returned. The only thing is that we don't have any logic that transforms a Buffer into a Stream which can be resolved simply by using the iltorb stream APIs that is already available.

Again, I'm against adding a dependency to solve a problem that doesn't exist.

@oohnoitz Thanks for clearing this out, Yes I can definitely see how it'll break the Buffer.isBuffer calls, and it will require additional knowledge of how BufferList operates.
image