Dubious assignment to be a check in hv_cli_and.c
Closed this issue · 12 comments
This assignment that will be a check is dubious. It is the same for the two cases. Suggest to fix and also have a test case that catches it, if possible. Does the code in nodetuple_richcompare
work as intended? I don't understand it after these years.
hv_cli_and.c, lines 235-236
vi = (Py_ssize_t)vt->ob_item[i];
**wi = (Py_ssize_t)vt->ob_item[i];**
Is it supposed to be wi = (long)wt->ob_item[i];
? Hmm... I'll check. (Got really busy with other stuffs recently)
How to get a reference to this class:
$ python
Python 3.7.5 (default, Oct 23 2019, 18:04:59)
[GCC 9.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import guppy.heapy.heapyc
>>> hv = guppy.heapy.heapyc.HeapView(guppy.heapy.heapyc.RootState, ())
>>> d = {}
>>> cli = hv.cli_and((), d)
>>> cli.classify(())
()
>>> type(next(iter(d.keys())))
<class 'guppy.heapy.heapyc.NodeTuple'>
>>>
Woah, that is more complicated than I had expected.
Yeah, it's definitely wrong.
$ python
Python 3.7.5 (default, Oct 23 2019, 18:04:59)
[GCC 9.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import guppy.heapy.heapyc
>>> hv = guppy.heapy.heapyc.HeapView(guppy.heapy.heapyc.RootState, ())
>>> f = lambda l: hv.cli_and(tuple(hv.cli_user_defined(hv.cli_none(), None, lambda obj: i, None) for i in l), {}).classify(None)
>>> a = f((1,)); b = f((2,)); a, b, type(a), type(b), a == b
((1,), (2,), <class 'guppy.heapy.heapyc.NodeTuple'>, <class 'guppy.heapy.heapyc.NodeTuple'>, True)
>>> a = f((1,)); b = f((2,)); a, b, type(a), type(b), a == b, hash(a) == hash(b)
((1,), (2,), <class 'guppy.heapy.heapyc.NodeTuple'>, <class 'guppy.heapy.heapyc.NodeTuple'>, True, False)
This guppy.heapy.heapyc.NodeTuple
seems really hard to access from normal Python space. The result from the classifier seems always abstracted away by guppy.heapy.UniSet.Kind
. Any idea how to test this?
I could test it by accessing the low level classifier. It is available at hpy().View.hv.cli_and
sverker@sverker-HP-Pavilion-g6-Notebook-PC:~/git/guppy-pe/bugs$ cat test_hv_cli_and.py
from guppy import hpy
h=hpy()
a=h.View.hv.cli_and((h.View.hv.cli_id(),), {})
assert a.classify(1)!=a.classify(2)
# Before fix
python
Python 2.7.15+ (default, Oct 7 2019, 17:39:04)
[GCC 7.4.0] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import test_hv_cli_and
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "test_hv_cli_and.py", line 4, in <module>
assert a.classify(1)!=a.classify(2)
AssertionError
# After fix
python
Python 2.7.15+ (default, Oct 7 2019, 17:39:04)
[GCC 7.4.0] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import test_hv_cli_and
>>>
I can see you accessed the low level classifier too in a previous comment where you used the user defined classifier too, I didn't understand it or read it well enough right away.
Mine was an alternative that was arguably easier to do.
Was your question how we can test it without accessing the low level classifier?
Was your question how we can test it without accessing the low level classifier?
Yeah. Like, does the abstractions above invoke nodetuple_richcompare?
I mean, if not, we could just test the low-level, or, we could test both high-level and low-level.
I find that it is called but I can not produce an error. It is used when partitioning a large enough set with the and classifier. But gdb gives strange results. The richcompare is called with the same results each time, presumably from the dict lookup when there is hash collisions, but why is it getting the same arguments each time? Is it something wrong with gdb interaction?
cat test3.py
from guppy import hpy
h=hpy()
al=[[] for x in range(100000)]
a0=al[:10000]
print h.idset(al).by(h.Id & h.Size) & h.idset(a0[:100]).by(h.Id & h.Size).kind
gdb python
GNU gdb (Ubuntu 8.1-0ubuntu3.1) 8.1.0.20180409-git
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "i686-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from python...(no debugging symbols found)...done.
(gdb) br nodetuple_richcompare
Function "nodetuple_richcompare" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (nodetuple_richcompare) pending.
(gdb) run
Starting program: /usr/bin/python
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/i386-linux-gnu/libthread_db.so.1".
Python 2.7.15+ (default, Oct 7 2019, 17:39:04)
[GCC 7.4.0] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import test3
warning: Error reading shared library list entry at 0xffffb420
warning: Error reading shared library list entry at 0xffffbe60
Breakpoint 1, nodetuple_richcompare (v=0xb748c36c, w=0xb748c38c, op=2)
at src/heapy/hv_cli_and.c:202
202 {
(gdb) c
Continuing.
Breakpoint 1, nodetuple_richcompare (v=0xb748c36c, w=0xb748c38c, op=2)
at src/heapy/hv_cli_and.c:202
202 {
(gdb) c
Continuing.
Breakpoint 1, nodetuple_richcompare (v=0xb748c36c, w=0xb748c38c, op=2)
at src/heapy/hv_cli_and.c:202
202 {
(gdb) c
Continuing.
Breakpoint 1, nodetuple_richcompare (v=0xb748c36c, w=0xb748c38c, op=2)
at src/heapy/hv_cli_and.c:202
202 {
(gdb) dis 1
(gdb) c
Continuing.
Partition of a set of 100 objects. Total size = 3200 bytes.
Index Count % Size % Cumulative % Kind (class / dict of class)
0 100 100 3200 100 3200 100 list
>>>
Woah
(venv) zhuyifei1999@zhuyifei1999-ThinkPad-T480 ~/guppy3 $ gdb -ex 'break nodetuple_richcompare' -ex r --args python test3.py
GNU gdb (Gentoo 8.3.1 vanilla) 8.3.1
[...]
Breakpoint 1, nodetuple_richcompare (v=(<type at remote 0x7ffff7d5e960>, <guppy.sets.setsc.ImmNodeSet at remote 0x7ffff5dce290>), w=(<type at remote 0x7ffff7d5e960>, <guppy.sets.setsc.ImmNodeSet at remote 0x7ffff5dce290>), op=2) at /home/zhuyifei1999/guppy3/src/heapy/hv_cli_and.c:205
205 Py_ssize_t vi=0, wi=0;
(gdb) bt
#0 nodetuple_richcompare (v=(<type at remote 0x7ffff7d5e960>, <guppy.sets.setsc.ImmNodeSet at remote 0x7ffff5dce290>), w=(<type at remote 0x7ffff7d5e960>, <guppy.sets.setsc.ImmNodeSet at remote 0x7ffff5dce290>), op=2) at /home/zhuyifei1999/guppy3/src/heapy/hv_cli_and.c:205
#1 0x00007ffff7c1c9f2 in do_richcompare (op=2, w=(<type at remote 0x7ffff7d5e960>, <guppy.sets.setsc.ImmNodeSet at remote 0x7ffff5dce290>), v=(<type at remote 0x7ffff7d5e960>, <guppy.sets.setsc.ImmNodeSet at remote 0x7ffff5dce290>)) at /usr/src/debug/dev-lang/python-3.7.5-r1/Python-3.7.5/Objects/object.c:695
#2 PyObject_RichCompare (v=v@entry=(<type at remote 0x7ffff7d5e960>, <guppy.sets.setsc.ImmNodeSet at remote 0x7ffff5dce290>), w=w@entry=(<type at remote 0x7ffff7d5e960>, <guppy.sets.setsc.ImmNodeSet at remote 0x7ffff5dce290>), op=op@entry=2) at /usr/src/debug/dev-lang/python-3.7.5-r1/Python-3.7.5/Objects/object.c:743
#3 0x00007ffff7a8610c in PyObject_RichCompareBool (op=2, w=(<type at remote 0x7ffff7d5e960>, <guppy.sets.setsc.ImmNodeSet at remote 0x7ffff5dce290>), v=(<type at remote 0x7ffff7d5e960>, <guppy.sets.setsc.ImmNodeSet at remote 0x7ffff5dce290>)) at /usr/src/debug/dev-lang/python-3.7.5-r1/Python-3.7.5/Objects/object.c:751
#4 lookdict (mp=0x7ffff5ddf910, key=<optimized out>, hash=<optimized out>, value_addr=0x7fffffffba90) at /usr/src/debug/dev-lang/python-3.7.5-r1/Python-3.7.5/Objects/dictobject.c:757
#5 0x00007ffff7c30bc1 in PyDict_GetItem (op={(<type at remote 0x7ffff7d5e960>, <guppy.sets.setsc.ImmNodeSet at remote 0x7ffff5dce290>): (...)}, key=<optimized out>) at /usr/src/debug/dev-lang/python-3.7.5-r1/Python-3.7.5/Objects/dictobject.c:1327
#6 0x00007ffff69faab7 in hv_cli_and_fast_memoized_kind (self=0x7ffff5dde7d0, kind=(<type at remote 0x7ffff7d5e960>, <guppy.sets.setsc.ImmNodeSet at remote 0x7ffff5dce290>)) at /home/zhuyifei1999/guppy3/src/heapy/hv_cli_and.c:42
#7 0x00007ffff69fae83 in hv_cli_and_classify (self=0x7ffff5dde7d0, obj=[]) at /home/zhuyifei1999/guppy3/src/heapy/hv_cli_and.c:117
#8 0x00007ffff6a01320 in cli_epartition_iter (obj=[], ta=0x7fffffffbce0) at /home/zhuyifei1999/guppy3/src/heapy/classifier.c:146
#9 0x00007ffff6a296ba in NyNodeSet_iterate (ns=0x555555580100, visit=0x7ffff6a012ea <cli_epartition_iter>, arg=0x7fffffffbce0) at /home/zhuyifei1999/guppy3/src/sets/nodeset.c:508
#10 0x00007ffff69f8b09 in NyNodeSet_iterate (ns=0x555555580100, visit=0x7ffff6a012ea <cli_epartition_iter>, arg=0x7fffffffbce0) at /home/zhuyifei1999/guppy3/src/heapy/impsets.c:44
#11 0x00007ffff69f8c3b in iterable_iterate (v=<guppy.sets.setsc.ImmNodeSet at remote 0x555555580100>, visit=0x7ffff6a012ea <cli_epartition_iter>, arg=0x7fffffffbce0) at /home/zhuyifei1999/guppy3/src/heapy/heapyc.c:93
#12 0x00007ffff6a01429 in cli_epartition (self=0x7ffff5dc6b90, iterable=<guppy.sets.setsc.ImmNodeSet at remote 0x555555580100>) at /home/zhuyifei1999/guppy3/src/heapy/classifier.c:166
[...]
(gdb) py-bt
Traceback (most recent call first):
<built-in method epartition of guppy.heapy.heapyc.ObjectClassifier object at remote 0x7ffff5dc6b90>
File "/home/zhuyifei1999/guppy3/guppy/heapy/Classifiers.py", line 35, in call_with_referrers
return f(x)
File "/home/zhuyifei1999/guppy3/guppy/heapy/Classifiers.py", line 114, in partition_cli
self.cli.epartition)
File "/home/zhuyifei1999/guppy3/guppy/heapy/Classifiers.py", line 105, in partition
for k, v in self.partition_cli(iterable):
File "/home/zhuyifei1999/guppy3/guppy/heapy/Part.py", line 709, in __init__
for (kind, part) in classifier.partition(set.nodes)]
File "/home/zhuyifei1999/guppy3/guppy/heapy/Part.py", line 800, in partition
return SetPartition(self, set, er)
File "/home/zhuyifei1999/guppy3/guppy/heapy/UniSet.py", line 1692, in get_partition
p = a.fam.Part.partition(a, a.er)
File "/home/zhuyifei1999/guppy3/guppy/heapy/UniSet.py", line 1603, in c_str
a.fam.get_partition(a).ppob(ob)
File "/home/zhuyifei1999/guppy3/guppy/heapy/UniSet.py", line 344, in __str__
return self.fam.c_str(self)
<built-in method print of module object at remote 0x7ffff6c6fd10>
File "test3.py", line 6, in <module>
print(h.idset(al).by(h.Id & h.Size) & h.idset(a0[:100]).by(h.Id & h.Size).kind)
Ok, a broken richcompare that is actively used but doesn't break the entire system... is interesting. I'm looking into it.
Ah, I see. The hash "collision" is because the tuple values are the same, and our hash function has been good enough to avoid a collision when the tuple values are not the same.
The bug was a false positive during a richcompare when the tuple are not the same, but in this case, richcompare would not be called if the hash values are different. There wouldn't be a way to trigger the bug unless we can generate a hash collision for different tuple values.
Why is it getting the same arguments each time... probably because we aren't leaking memory and the tuple is recreated and destroyed again and again in hv_cli_and_classify
.
Btw, the object that is being classified is []
and the classify result of the and classifier is (<type at remote 0x7ffff7d5e960>, <guppy.sets.setsc.ImmNodeSet at remote 0x7ffff5dce290>)
. What sort of classifier is that? Isn't it supposed to be ([], 72)
for an id & size classifier?
Btw, the object that is being classified is
[]
and the classify result of the and classifier is(<type at remote 0x7ffff7d5e960>, <guppy.sets.setsc.ImmNodeSet at remote 0x7ffff5dce290>)
. What sort of classifier is that? Isn't it supposed to be([], 72)
for an id & size classifier?
Answer to self: Clodo, this is classifying during print so the set's printing classifer is still clodo.
(gdb) up
#1 0x00007ffff7c189f2 in do_richcompare (op=2, w=(<type at remote 0x7ffff7d5a960>, <guppy.sets.setsc.ImmNodeSet at remote 0x7ffff5dc5290>), v=(<type at remote 0x7ffff7d5a960>, <guppy.sets.setsc.ImmNodeSet at remote 0x7ffff5dc5290>)) at /usr/src/debug/dev-lang/python-3.7.5-r1/Python-3.7.5/Objects/object.c:695
695 res = (*f)(v, w, op);
[...]
(gdb)
#7 0x00007ffff69f1e83 in hv_cli_and_classify (self=0x7ffff5dd5640, obj=[]) at /home/zhuyifei1999/guppy3/src/heapy/hv_cli_and.c:117
117 result = hv_cli_and_fast_memoized_kind(self, kind);
(gdb) p ((NyObjectClassifierObject *)((PyTupleObject *)classifiers)->ob_item[0])->def->classify
$1 = (PyObject *(*)(PyObject *, PyObject *)) 0x7ffff69f513b <hv_cli_type_classify>
(gdb) p ((NyObjectClassifierObject *)((PyTupleObject *)classifiers)->ob_item[1])->def->classify
$2 = (PyObject *(*)(PyObject *, PyObject *)) 0x7ffff69f2a28 <hv_cli_dictof_classify>
class AndClassifier(Classifier):
[...]
def get_kind(self, k):
ks = []
for ki, ci in zip(k, self.args):
ks.append(ci.get_kind(ki))
return self.mod.UniSet.fam_And._cons(ks)
class Classifier:
[...]
def partition(self, iterable):
items = []
for k, v in self.partition_cli(iterable):
k = self.get_kind(k)
v = self.mod.Use.idset(v, er=self.er)
items.append((k, v))
return items
def partition_cli(self, a):
ep = self.call_with_referrers(
a,
self.cli.epartition)
return [(k, ep[k]) for k in ep.get_domain()]
Yeah, I see the NodeTuple is iterated through and used to create a list, so I don't think at the high level there is any use of NodeTuple, after the classification. During classification the NodeTuple is only used as keys to the dict, which would be somewhat hard to make a hash collision to hit this bug.
I guess a test at the low level is the way to go then. (Will do later)