aconrad/pycobertura

Running out of memory

kannaiah opened this issue · 4 comments

I'm using pycobertura to read more than 1400 coverage files.
My script is being killed because memory usage keeps increasing with each file. My code looks like this:

from pycobertura import Cobertura

for file in covfiles:
    cov = Cobertura(file)
    for f in cov.files():
        for line in cov.hit_statements(f):
            ...

I tried del cov but that didn't help.
I think it is because of @memoize; my script runs fine when I comment out @memoize.

Hi @kannaiah, thanks for reporting this! I really appreciate it.

Yeah, I don't know if memoize makes much of a performance difference. I honestly wrote it without having a specific performance problem; it just felt like the same XML parsing work was being done over and over, so I wrote memoize. In hindsight it may have been premature optimization.

Since you have a lot of files, would you mind running benchmarks with and without memoize and reporting back whether there's a big processing-time difference? Since you're running out of memory, you'd have to cut down the number of files you parse and find the sweet spot so the test runs can complete.
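For reference, a benchmark could be as simple as something like this (the file glob and timing approach are assumptions for illustration, not from this thread):

import glob
import time
from pycobertura import Cobertura

covfiles = glob.glob("reports/*.xml")  # hypothetical location of the coverage files

def bench(files):
    # Walk every file and force the XML parsing work that memoize would cache.
    start = time.perf_counter()
    for path in files:
        cov = Cobertura(path)
        for f in cov.files():
            cov.hit_statements(f)
    return time.perf_counter() - start

# Run once with @memoize enabled, once with it commented out, and compare.
print(bench(covfiles))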

That said, memoize could be useful, and we'd have to find out why the memory isn't freed once you're done iterating over a coverage file. Do you want to take a look?

I see the memoize is from this gist, and the comment there does say that the cache would never be freed:
https://gist.github.com/codysoyland/267733/8f5d2e3576b6a6f221f6fb7e2e10d395ad7303f9#gistcomment-17644
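For illustration, a memoize in that style keeps its cache in a closure that never goes away, and the cache keys include self, so every Cobertura instance stays reachable even after del cov (a minimal sketch in the style of that gist, not pycobertura's actual code):

import functools

def memoize(func):
    cache = {}  # lives as long as the decorated function, i.e. the whole process

    @functools.wraps(func)
    def wrapper(*args):
        # `args` includes `self`, so the cache holds a reference to every
        # instance the method was ever called on -- `del cov` can't free it.
        if args not in cache:
            cache[args] = func(*args)
        return cache[args]
    return wrapper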

And it links to a better memoize, http://code.activestate.com/recipes/577452-a-memoize-decorator-for-instance-methods/, which I tested and which seems to work.
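The key idea of that recipe is to store the cache on the instance itself, so the cached results are garbage collected together with the instance. Roughly (a sketch along the lines of the recipe, not a verbatim copy):

import functools

class memoize:
    """Cache method results on the instance, so the cache dies with it."""
    def __init__(self, func):
        self.func = func

    def __get__(self, obj, objtype=None):
        # Accessed on the class: return the raw function.
        if obj is None:
            return self.func
        # Accessed on an instance: bind it as the first argument.
        return functools.partial(self, obj)

    def __call__(self, obj, *args, **kwargs):
        # The cache is an attribute of `obj`, not of the decorator, so
        # garbage collecting the instance frees the cached results too.
        try:
            cache = obj.__cache
        except AttributeError:
            cache = obj.__cache = {}
        key = (self.func, args, frozenset(kwargs.items()))
        if key not in cache:
            cache[key] = self.func(obj, *args, **kwargs)
        return cache[key]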

Awesome, thanks for looking into it! Can you submit a pull request if you have a chance?

Fixed by #89