alex-sherman/deco

High memory usage - how to reduce?


deco seems to make my program use more memory than I expected (regardless of the number of workers).

Let's say I have code like this (details left out for clarity):

@concurrent
def slow(index):
    ...  # do something expensive with index

def run():
    for index in iterator_w_200K_items:
        slow(index)  # each call queues a pending job
    slow.wait()

It seems like the iterator is being read all the way through up front, with a pending job created for every item, so memory usage is far higher than it needs to be. (To verify, I replaced iterator_w_200K_items with iterator_w_2K_items and memory usage went way down.)
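For what it's worth, the comparison can be reproduced without building a second iterator; this is a minimal sketch, and run_small and the use of itertools.islice here are mine, not anything deco provides:

import itertools

def run_small(n):
    # Submit only the first n items, so peak memory for n=2_000
    # can be compared directly against n=200_000.
    for index in itertools.islice(iterator_w_200K_items, n):
        slow(index)
    slow.wait()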

Is there a way I can have deco work in smaller chunks?

I hope that makes sense.

Perhaps there's a way to call wait partway through, and once the first batch finishes, continue on to the next?

That idea seems to be working, actually. I'll go with it unless you see any problems with the approach.


@concurrent
def slow(index):
    ...  # do something expensive with index

def run():
    for batch in chunked(iterator_w_200K_items, 1000):
        for item in batch:
            slow(item)
        slow.wait()  # drain this batch before queuing the next
    slow.wait()  # clear up stragglers? (should be a no-op after the per-batch waits)
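The chunked helper isn't defined above; here's one possible implementation (a sketch: the chunked name, the itertools.islice approach, and the batch size of 1000 are all illustrative choices, not part of deco):

import itertools

def chunked(iterable, size=1000):
    # Yield successive lists of at most `size` items, pulling from the
    # iterable lazily so only one batch is materialized at a time.
    it = iter(iterable)
    while True:
        batch = list(itertools.islice(it, size))
        if not batch:
            return
        yield batch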

That seems to be the best way to approach the problem; multiprocessing.Pool doesn't appear to have any mechanism for this sort of thing either. It would be possible to build this into deco, e.g. by letting you define a batch size, but since the solution you've found exists and is fairly straightforward, I'll leave it out in favor of simplicity.
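For readers who land here without deco: the same bounded-batch idea can be sketched with the standard library's concurrent.futures. Everything below (ProcessPoolExecutor, the function names, the batch size) is illustrative, not something deco or this issue prescribes:

import itertools
from concurrent.futures import ProcessPoolExecutor, wait

def slow(index):
    ...  # stand-in for the real work function

def run(items, batch_size=1000):
    with ProcessPoolExecutor() as pool:
        it = iter(items)
        while True:
            batch = list(itertools.islice(it, batch_size))
            if not batch:
                break
            # At most batch_size futures are alive at once,
            # which bounds the memory used by pending jobs.
            wait([pool.submit(slow, i) for i in batch])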

Thanks for mentioning this, by the way; I hope it helps others who find themselves in a similar situation.