PyQuery objects returned by items() have problems
Closed this issue · 2 comments
ezk84 commented
Given a = pdf.pq('LTTextLineHorizontal').items().next()
a.find(':in_bbox("x0,y0,x1,y1")')
raises an ExpressionError: The pseudo-class :in_bbox() is unknown
a.parent('LTPage')
returns an empty list, even though
a.parents().filter(lambda i, a: a.tag == 'LTPage')
returns the expected parent (assume here that the LTPage is the direct parent of the element matched by a).
These two calls would have succeeded had a not been a result of the items() iterator, for example with
a = pdf.pq('LTTextLineHorizontal[index="13"]')
johnsonc commented
I'm having some issues along similar lines. It seems the pyquery interface works erratically sometimes. I'll try to get a reproducible error.
jcushman commented
These both currently work for me:
In [54]: next(pdf.pq('LTTextLineHorizontal').items()).find(':in_bbox("0,0,10000,10000")')
Out[54]: [<LTTextBoxHorizontal>]
In [61]: next(pdf.pq('LTTextLineHorizontal').items()).parent()
Out[61]: [<LTRect>]
In [62]: next(pdf.pq('LTTextLineHorizontal').items()).parent('LTRect')
Out[62]: [<LTRect>]
Feel free to reopen if you can reproduce your error.
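A side note for anyone retrying the snippets in this thread: the original report calls .items().next(), which is the Python 2 iterator method, while the reply above uses the built-in next(), which also works on Python 3. The following is only a minimal sketch of that difference; the items() function and the strings it yields are hypothetical stand-ins for pdf.pq(...).items(), not pdfquery's or pyquery's actual implementation.

```python
def items():
    """Hypothetical stand-in for pdf.pq(...).items(): yields one object per match."""
    for name in ["LTTextLineHorizontal#0", "LTTextLineHorizontal#1"]:
        yield name

# The built-in next() works on any iterator, on Python 2.6+ and Python 3.
first = next(items())

# Passing a default avoids StopIteration when the selector matched nothing.
nothing = next(iter([]), None)

# On Python 3, generators expose __next__ rather than .next(), so the
# Python 2 spelling items().next() would raise AttributeError there.
has_next_method = hasattr(items(), "next")
```

This does not explain the :in_bbox or parent('LTPage') behaviour reported above; it only shows how to fetch the first item in a way that runs on both Python versions before attempting to reproduce the error.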