lorien/selection

Issue with CamelCase attributes

oiwn opened this issue · 2 comments

oiwn commented

I have an issue selecting camelCase attributes with selection.

Let's take this XML file as an example:
https://raw.githubusercontent.com/rushter/data-science-blogs/master/data-science.opml

from lxml.etree import fromstring
from selection import XpathSelector

with open('data-science.opml', 'r') as f:
    xml = XpathSelector(fromstring(f.read()))
    print xml.select('//outline').attr_list('text')  # ok
    print xml.select('//outline').attr_list('htmlUrl')  # exception
lib/python2.7/site-packages/selection/backend/lxml.py", line 38, in attr
    raise DataNotFound(u'No such attribute: %s' % key)
weblib.error.DataNotFound: No such attribute: htmlUrl

As I understand it, this is an lxml problem, but unfortunately I can't find any information about it on Google.

oiwn commented

Update:

nodes = xml.select('//outline').node_list()
for node in nodes:
    print node.attrib
    print node.get('htmlUrl')

This code works with camelCase attributes as expected.

Any ideas? In a previous project I had XML files with elements like this:

            # <ttItem
            #   styleCode="0645di" origStyleCode=""
            #   itemNumber="0645dibk3" colorCode="bk"
            #   sizeCode="3" description="" regPrice="9.6"
            #   salePrice="0.0" saleLabel="" disclaimer="false"
            #   availQty="296" orderLimit="0" orderQty="0"/>

and I simply used the lowercase attribute names, which worked well:

elem['code'] = item.select('.').attr('stylecode')

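Your OPML file contains one extra outline tag that is the parent of all the other outline tags. That parent has no htmlUrl attribute, so attr_list('htmlUrl') fails as soon as it reaches it; the camelCase name itself is not the problem.
Use this code to select only the outline tags that have the attribute: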

sel.select('//outline[@htmlUrl]').attr_list('htmlUrl')
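
To see why the predicate helps, here is a minimal sketch that reproduces the situation without the selection library. It uses the standard library's xml.etree.ElementTree as a stand-in for lxml (an assumption made only to keep the example self-contained; findall and get behave the same way for this case). The OPML snippet and the example.com URLs are made up for illustration and are not taken from the real file.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature OPML: one parent outline without htmlUrl,
# wrapping two child outlines that do have it.
OPML = """<opml version="1.1">
  <body>
    <outline text="Data science blogs">
      <outline text="Blog A" htmlUrl="http://a.example.com/" />
      <outline text="Blog B" htmlUrl="http://b.example.com/" />
    </outline>
  </body>
</opml>"""

root = ET.fromstring(OPML)

# Every outline element, including the parent that has no htmlUrl.
all_outlines = root.findall('.//outline')
missing = [node.get('text') for node in all_outlines
           if node.get('htmlUrl') is None]

# Only the outline elements that actually carry the attribute.
with_url = root.findall('.//outline[@htmlUrl]')
urls = [node.get('htmlUrl') for node in with_url]

# XML attribute names are case-sensitive, so the lowercase name finds nothing.
lowercase_lookup = with_url[0].get('htmlurl')

print(missing)
print(urls)
print(lowercase_lookup)
```

The first list names the one element that lacks htmlUrl, which is the element that would trigger DataNotFound in the original snippet. Filtering with [@htmlUrl] leaves only elements where the lookup can succeed.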