mgedmin/objgraph

show_growth prints incorrect results if peak_stats not passed

jessehersch opened this issue · 4 comments

the problem

show_growth() relies on the strange behavior of taking a mutable as the default value of an arg. That leads to a bug where incorrect results are returned. At least they are incorrect to my understanding. Could be my understanding is wrong though :)

See repro below.

objgraph version: 3.4.1

repro:

import objgraph


class Something:
    def __init__(self, a, b, c):
        self.a = a
        self.b = b
        self.c = c


def main():
    works()
    broken()


def works():
    # if pass peak_stats it works better:
    print('\n------------this works-------------')
    peak_stats = {}
    objgraph.show_growth(shortnames=False, peak_stats=peak_stats)
    peak_stats = objgraph.typestats(shortnames=False)

    x = [Something(i, i ** 2, str(i)) for i in range(1000)]
    print(f'\nshould have 1000 somethings now. first one is {x[0]}')
    objgraph.show_growth(shortnames=False, peak_stats=peak_stats)
    peak_stats = objgraph.typestats(shortnames=False)

    del x
    print(f'\nsomethings should be gone now:')
    objgraph.show_growth(shortnames=False, peak_stats=peak_stats)
    peak_stats = objgraph.typestats(shortnames=False)

    x = [Something(i, i ** 2, str(i)) for i in range(1000)]
    print(f'\nshould have 1000 somethings again, and we do. first one is {x[0]}')
    objgraph.show_growth(shortnames=False, peak_stats=peak_stats)
    peak_stats = objgraph.typestats(shortnames=False)


def broken():
    print('\n------------this is broken-------------')
    objgraph.show_growth(shortnames=False)
    x = [Something(i, i ** 2, str(i)) for i in range(1000)]
    print(f'\nshould have 1000 somethings now. first one is {x[0]}')
    objgraph.show_growth(shortnames=False)

    del x
    print(f'\nsomethings should be gone now:')
    objgraph.show_growth(shortnames=False)

    x = [Something(i, i ** 2, str(i)) for i in range(1000)]
    print(f'\nshould have 1000 somethings again, but we do not. first one is {x[0]}')
    #
    # the bug is provoked on this call.
    #
    # it should print a line with "__main__.Something     1000     +1000"
    #
    # instead it prints nothing. I think this is due to relying on the strange behavior
    # of specifying a mutable as a default value for an arg, in this case the peak_stats arg of show_growth()
    #
    objgraph.show_growth(shortnames=False)
    print(x[0])


if __name__ == "__main__":
    main()

which produces something like this:

------------this works-------------
builtins.function                       2261     +2261
builtins.tuple                          1885     +1885
builtins.dict                           1425     +1425
builtins.wrapper_descriptor              998      +998
builtins.weakref                         894      +894
builtins.set                             805      +805
builtins.method_descriptor               732      +732
builtins.builtin_function_or_method      702      +702
builtins.list                            551      +551
builtins.getset_descriptor               408      +408

should have 1000 somethings now. first one is <__main__.Something object at 0x7fdcef2726a0>
__main__.Something     1000     +1000
builtins.frame            7        +2
builtins.list           552        +1
builtins.cell            42        +1

somethings should be gone now:
builtins.frame        7        +2
builtins.cell        42        +1

should have 1000 somethings again, and we do. first one is <__main__.Something object at 0x7fdcee6986a0>
__main__.Something     1000     +1000
builtins.frame            7        +2
builtins.list           552        +1
builtins.cell            42        +1

------------this is broken-------------
builtins.function                       2261     +2261
builtins.tuple                          1864     +1864
builtins.dict                           1425     +1425
builtins.wrapper_descriptor              998      +998
builtins.weakref                         894      +894
builtins.set                             805      +805
builtins.method_descriptor               732      +732
builtins.builtin_function_or_method      702      +702
builtins.list                            551      +551
builtins.getset_descriptor               408      +408

should have 1000 somethings now. first one is <__main__.Something object at 0x7fdcedfabbe0>
__main__.Something     1000     +1000
builtins.list           552        +1

somethings should be gone now:

should have 1000 somethings again, but we do not. first one is <__main__.Something object at 0x7fdcedf3b8d0>
<__main__.Something object at 0x7fdcedf3b8d0>

This is actually working broken working as designed, because show_growth() shows

Show the increase in peak object counts since last call.

(emphasis mine)

If you change your broken() function to create 1002 somethings in the second list, you'll see

...
------------this is broken-------------
builtins.function                       1793     +1793
builtins.wrapper_descriptor             1039     +1039
builtins.dict                            929      +929
builtins.builtin_function_or_method      792      +792
builtins.method_descriptor               751      +751
builtins.weakref                         581      +581
builtins.tuple                           544      +544
builtins.getset_descriptor               372      +372
builtins.member_descriptor               303      +303
builtins.type                            145      +145

should have 1000 somethings now. first one is <__main__.Something object at 0x7f3afbb50110>
__main__.Something     1000     +1000
builtins.list           126        +1

somethings should be gone now:

should have 1000 somethings again, but we do not. first one is <__main__.Something object at 0x7f3afb9c8750>
__main__.Something     1002        +2
<__main__.Something object at 0x7f3afb9c8750>

IIRC I chose to do this because I was mostly interested in hunting memory leaks (objects that never go away), and objects where the counts go up and down all the time but never go beyond a certain fixed peak level used to produce too much noise in the output.

I wish I were rich and could hire a technical writer to make the documentation clearer...

understood. from my perspective showing the current counts is more useful, but I can implement that myself easily enough.

I didn't realize until now that it was showing peak counts, although I should have realized it since the param in growth() that takes the mutable as a default is called peak_stats :)

actually not sure I agree that is working correctly because once 1000 somethings are created again after all have been destroyed, then delta should be +1000 again. but it's not. it's simply missing. If that's by design, ok. that behavior is strange to me though.

The delta shown is always relative to the last peak value.