mcollina/msgpack5

be abstract-encoding compatible

mafintosh opened this issue · 8 comments

Would be awesome if this was abstract-encoding compatible, https://github.com/mafintosh/abstract-encoding. Mainly needs a msgpack.encodingLength method.
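For reference, the interface looks roughly like this (a sketch based on the abstract-encoding README; the stub bodies are placeholders, not msgpack5 code):

// Shape of an abstract-encoding compatible codec (sketch).
const enc = {
  // Must encode obj into buffer at offset (both optional; a fresh buffer
  // is allocated if none is given), return the buffer, and set
  // enc.encode.bytes to the number of bytes written.
  encode (obj, buffer, offset) { /* ... */ },
  // Must decode from buffer between start and end (both optional), return
  // the value, and set enc.decode.bytes to the number of bytes consumed.
  decode (buffer, start, end) { /* ... */ },
  // Must return how many bytes encode(obj) would produce, without encoding.
  encodingLength (obj) { /* ... */ }
}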

Yes it would be! However, the internal design is not built to handle this case, as it is based on bl, so the actual length is only known after everything has been encoded. Damn.

What's the need for encodingLength()? Once we have a Buffer, we have the length.

It also needs support for encoding.decode.bytes.
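decode.bytes is what lets a consumer read back-to-back messages out of a single buffer; a minimal sketch of that usage, assuming an abstract-encoding compatible codec enc (decodeAll is a hypothetical helper, not part of msgpack5):

// Read consecutive messages from one buffer by advancing an offset
// with decode.bytes after each call.
function decodeAll (enc, buffer) {
  const messages = []
  let offset = 0
  while (offset < buffer.length) {
    messages.push(enc.decode(buffer, offset))
    offset += enc.decode.bytes // bytes consumed by the last decode
  }
  return messages
}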

Anyway, I'll be super happy to merge any PRs that handle this.

encodingLength is incredibly useful for preallocating a single buffer that contains multiple encoded messages, instead of doing a bunch of small allocations and then concatenating them. There is a decent chance this might actually speed up your code as well; it did for my protobuf stuff.
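A minimal sketch of that preallocation pattern, assuming an abstract-encoding compatible codec enc (encodeAll is a hypothetical helper, not existing API):

// Pack many messages into one preallocated buffer instead of allocating
// and concatenating one small buffer per message.
function encodeAll (enc, messages) {
  let total = 0
  for (const msg of messages) total += enc.encodingLength(msg)
  const buffer = Buffer.allocUnsafe(total) // single allocation
  let offset = 0
  for (const msg of messages) {
    enc.encode(msg, buffer, offset) // encode in place
    offset += enc.encode.bytes // bytes written by the last encode
  }
  return buffer
}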

@mafintosh I agree with you. I just don't have time :(, as I am not currently using this directly.
How do you do that with JSON?

I don't do it for JSON. Mostly for all my other binary encoding stuff

It's impossible to implement encodingLength here without actually encoding. But I have an idea: we can simply store the last result, and when encode is invoked afterwards, just return the previously computed one.

By monkey patching this can be done as follows:

function decorate (encoding) {
  let _recent = new WeakMap()
  const _encode = encoding.encode

  function encodingLength (obj) {
    try {
      const encoded = _encode(obj)
      _recent = new WeakMap() // WeakMap has no clear(), so replace it instead
      _recent.set(obj, encoded)
      return encoded.length
    } catch (err) {
      return null
    }
  }

  function encode (obj) {
    const encoded = _recent.get(obj)
    if (!encoded) return _encode(obj)
    _recent = new WeakMap() // drop the cached buffer once it has been consumed
    return encoded
  }

  return Object.assign(encoding, { encode, encodingLength })
}

To cover more cases, it's reasonable to clear the map only once we encode an object that is not in it:

function decorate (encoding) {
  let _recent = new WeakMap()
  const _encode = encoding.encode

  function encodingLength (obj) {
    try {
      const encoded = _encode(obj)
      _recent.set(obj, encoded)
      return encoded.length
    } catch (err) {
      return null
    }
  }

  function encode (obj) {
    const encoded = _recent.get(obj)
    if (encoded) return encoded
    _recent = new WeakMap() // WeakMap has no clear(), so replace it on a miss
    return _encode(obj)
  }

  return Object.assign(encoding, { encode, encodingLength })
}

So that we could do something like this (just a silly example):

const lengths = array.map(encoding.encodingLength)
const encoded = array.filter((_it, i) => lengths[i] < 100).map(encoding.encode)

without a performance penalty.

Would you like to send a PR?

I'd rather investigate how to implement the same idea internally and PR it. This code is just an illustration of the idea: abusing monkey patching is not a nice thing.