msgpack/msgpack-python

Unpacker buffer re-use much worse since 0.6.1

mattp- opened this issue · 10 comments

Hi,
I'm a user of msgpack by way of saltstack; which leverages a msgpack Unpacker object for each long lived tcp connection. I know this isn't very exact, but after an upgrade to msgpack 1.0.5 from an older release running 0.6.1, the memory use skyrocketed. I've looked over the buffer handling but didn't see anything obvious in terms of how this is handled, but I don't have much experience with Cython. I did however confirm it was specifically the Unpacker object(s) by way of memray captures.

Is there any way to restore this more aggressive buffer re-use? alternatively, I've noticed you've made comments elsewhere that this isn't really a 'problem' with msgpack itself which I do understand; is there a way to know the buffer is 'empty' and safe to delete the unpacker object to free the memory? thanks 👍

@mattp- Sounds interesting (I use msgpack for borgbackup) - do you have an easy way to reproduce the issue? Ideal would be an automated way, like some executable returning 0 for ok and 1 for not ok.

Could you determine the msgpack release or even the git bisect the commit when the regression was introduced?

@ThomasWaldmann unfortunately not really. i think it might be death by a thousand cuts, because in the salt-master process it uses 1 unpacker object per minion connection, and there are 30-40k+ where I see the problems, so in isolation a single object doesn't have as much of a perceived problem, but over that scale it is adding up (then compounding because salt uses multiprocessing and forks, so the in use memory of this object cascades). The only thing I can see is the codepath in 0.6.1 anecdotally releases buffer its not using more aggressively.

Did you observe the regression using the same python version or was there a python upgrade at the same time as the msgpack ugprade?

the upgrade also included 3.7.x to 3.11.x/3.10.x ; but the fact that it 0.6.1 on 3.11 seems to revert to previous behavior suggests theres nothing python related going on - could be wrong though.

I don't have any idea. If you can not create reproducer, try 1.0.0 and 0.6.2.

@mattp- Also you could determine whether you use the fast (cython/c) msgpack implementation or the slow (pure python) one.

Usually it should be the fast one, but there can be circumstances with that one not being available and then it automatically falls back to the slow one.

Tried to reproduce here (with msgpack 1.0.7, fast and slow unpackb), but did not see anything suspicious yet (didn't run it for very long times though):

import sys
import msgpack
import msgpack.fallback
from msgpack import packb, unpackb, Packer, Unpacker

print(f"msgpack version: {msgpack.__version__}")
print(f"using slow pure python implementation: {unpackb is msgpack.fallback.unpackb}")

def make_array(size):
    return [i for i in range(size)]

def make_map(size):
    return {str(i): float(i) for i in range(size)}

data = [
    None,
    True,
    2**7-1, -2**5+1, 2**8-1, 2**16-1, 2**32-1, 2**64-1,
    23.42,
    "x" * (2**5-1), "x" * (2**8-1), "x" * (2**16-1), "x" * (2**16),
    b"y" * (2**8-1), b"y" * (2**16-1), b"y" * (2**16),
    make_array(2**4-1), make_array(2**16-1), make_array(2**16),
    make_map(2**4-1), make_map(2**16-1), make_map(2**16),
    # Timestamp type untested yet
    # ext types untested yet
]
packed = packb(data)
unpacker = Unpacker()

for i in range(10**9):
    unpacker.feed(packed)
    for unpacked in unpacker:
        assert unpacked == data, f"{unpacked} != {data}"
    if i % 1000 == 0:
        sys.stdout.write(".")
        sys.stdout.flush()

@ThomasWaldmann issue is in cython only; python fallback seems ok memory wise. just looking at your code, to be clear this is just an issue with the Unpacker object, which lets you stream data into it. packb/unpackb should be fine on either version.

@methane i'll try to bisect to find the first bad version; its going to take a while though due to needing to deploy and watch over time for it to re-occur. will report back here if I can narrow it down

@mattp- ok, i updated the code above.

Any new information?