sissaschool/elementpath

Issues with count() and possibly other methods

RabbitJackTrade opened this issue · 3 comments

Assume this xml:

sizes = """<?xml version="1.0" encoding="utf-8"?>
<pages>
  <page>
    <box>
      <line>
        <text size="12.482">C</text>
        <text size="12.333">A</text>
        <text size="12.333">P</text>
        <text size="12.333">I</text>
        <text size="12.482">T</text>
        <text size="12.482">O</text>
        <text size="12.482">L</text>
        <text size="12.482">O</text>
        <text></text>
        <text size="12.482">I</text>
        <text size="12.482">I</text>
        <text size="12.482">I</text>
        <text></text>
      </line>
    </box>
  </page>
</pages>
"""

And this code:
from lxml import etree
import elementpath
content = sizes.encode('utf-8')
root = etree.XML(content)

Say I want to count using the following xpath expression:

expres = 'count(//text[@size="12.482"][not(preceding-sibling::text[1][@size="12.482"])])'

The right answer is '3'. It can be verified by eyeballing the xml, or online here.
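As a cross-check that does not depend on any XPath engine, here is a minimal sketch that counts the same thing by hand with the standard library. It assumes the intent of the expression is "a text element with size 12.482 whose immediately preceding text sibling is either missing or has a different size". The names LINE_XML and count_run_starts are made up for this illustration, and only the line element from the sample is reproduced.

```python
import xml.etree.ElementTree as ET

# Only the <line> element of the sample document, without the XML declaration.
LINE_XML = (
    '<line>'
    '<text size="12.482">C</text><text size="12.333">A</text>'
    '<text size="12.333">P</text><text size="12.333">I</text>'
    '<text size="12.482">T</text><text size="12.482">O</text>'
    '<text size="12.482">L</text><text size="12.482">O</text>'
    '<text></text>'
    '<text size="12.482">I</text><text size="12.482">I</text>'
    '<text size="12.482">I</text>'
    '<text></text>'
    '</line>'
)


def count_run_starts(line, size):
    """Count <text> children with the given size whose nearest preceding
    <text> sibling is absent or has a different (or no) size attribute."""
    count = 0
    previous = None  # nearest preceding <text> sibling, None for the first child
    for node in line.findall('text'):
        same_as_previous = previous is not None and previous.get('size') == size
        if node.get('size') == size and not same_as_previous:
            count += 1
        previous = node
    return count


line = ET.fromstring(LINE_XML)
print(count_run_starts(line, "12.482"))  # prints 3
```

The three matches are the C of CAPITOLO, the T that follows the 12.333 run, and the first I after the empty text element.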

But when I compare lxml's etree to elementpath I get this:

et_cnt = root.xpath(expres)
ep_cnt = elementpath.select(root,expres)
print('etree count =',int(et_cnt))
print('elementpath count =',ep_cnt)

With the output being:
etree count = 3
elementpath count = 1

Am I doing something wrong or is it a bug?
Thanks.

Hi,
this seems to be a bug in the application of a sequence of predicates. I'll use your test to produce a fix.
Thanks
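To illustrate why the order of chained predicates matters in an expression like this one, here is a rough pure-Python sketch over the size values from the sample. It is not elementpath's code and makes no claim about the actual cause of the bug. It only contrasts the two readings of preceding-sibling::text[1][@size="12.482"]: take the nearest sibling first and then test its size (what the XPath expression asks for), versus filter by size first and then take the nearest match. All function names are hypothetical.

```python
# size attribute of each <text> child in document order (None = no attribute)
SIZES = ["12.482", "12.333", "12.333", "12.333",
         "12.482", "12.482", "12.482", "12.482", None,
         "12.482", "12.482", "12.482", None]
TARGET = "12.482"


def preceding(i):
    """Preceding siblings of item i, nearest first (reverse document order)."""
    return SIZES[:i][::-1]


def first_then_filter(i, size):
    """[1][@size=...]: keep the nearest sibling, then test its size."""
    nearest = preceding(i)[:1]
    return [s for s in nearest if s == size]


def filter_then_first(i, size):
    """[@size=...][1]: keep the siblings with that size, then take the nearest."""
    matching = [s for s in preceding(i) if s == size]
    return matching[:1]


def count(inner_predicate):
    """count(text[@size=TARGET][not(<inner_predicate>)]) over SIZES."""
    return sum(1 for i, s in enumerate(SIZES)
               if s == TARGET and not inner_predicate(i, TARGET))


print(count(first_then_filter))  # prints 3
print(count(filter_then_first))  # prints 1
```

With the predicates applied in the written order the count is 3. Swapping them leaves only the very first element, since every later 12.482 element has some earlier sibling of the same size.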

Fix for this is available in release v1.4.4.
Cheers

Works like a charm! Thanks for your great work.