Issue with CamelCase attributes
oiwn opened this issue · 2 comments
oiwn commented
I have an issue selecting camelCase attributes using selection.
Let's take this XML file as an example:
https://raw.githubusercontent.com/rushter/data-science-blogs/master/data-science.opml
from lxml.etree import fromstring
from selection import XpathSelector

with open('data-science.opml', 'r') as f:
    xml = XpathSelector(fromstring(f.read()))

print xml.select('//outline').attr_list('text')  # ok
print xml.select('//outline').attr_list('htmlUrl')  # exception
lib/python2.7/site-packages/selection/backend/lxml.py", line 38, in attr
raise DataNotFound(u'No such attribute: %s' % key)
weblib.error.DataNotFound: No such attribute: htmlUrl
As I understand it, this is an lxml problem, but unfortunately I can't find any information about it on Google.
oiwn commented
Update:
nodes = xml.select('//outline').node_list()
for node in nodes:
    print node.attrib
    print node.get('htmlUrl')
This code works with camelCase attributes as expected.
Any ideas? In a previous project I had XML files with elements like this:
# <ttItem
# styleCode="0645di" origStyleCode=""
# itemNumber="0645dibk3" colorCode="bk"
# sizeCode="3" description="" regPrice="9.6"
# salePrice="0.0" saleLabel="" disclaimer="false"
# availQty="296" orderLimit="0" orderQty="0"/>
and I simply used the lowercase attribute names, which worked well:
elem['code'] = item.select('.').attr('stylecode')
lorien commented
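Your OPML file contains one extra outline tag that is the parent of all the other outline tags. That parent tag has no htmlUrl attribute, so attr_list('htmlUrl') raises DataNotFound when it reaches it.
Use this code, which selects only the outline tags that have the attribute: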
sel.select('//outline[@htmlUrl]').attr_list('htmlUrl')
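
For anyone who wants to reproduce this without the original file, here is a minimal sketch of the same situation. It makes two assumptions that are not from the thread: it uses the standard library's xml.etree.ElementTree as a stand-in for lxml and selection, and the OPML snippet and its URLs are made up for illustration. It shows that the wrapper outline has no htmlUrl, that node.get() quietly returns None for it (which is probably why the loop above looked fine), and that the [@htmlUrl] predicate skips it.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature OPML document: one wrapper <outline> without
# htmlUrl, containing two feed entries that do have it.
OPML = """<opml version="1.0">
  <body>
    <outline text="Data Science" title="Data Science">
      <outline text="Blog A" htmlUrl="http://a.example/" />
      <outline text="Blog B" htmlUrl="http://b.example/" />
    </outline>
  </body>
</opml>"""

root = ET.fromstring(OPML)

# Every <outline>, including the wrapper that lacks htmlUrl.
all_outlines = root.findall('.//outline')

# .get() returns None for a missing attribute instead of raising.
urls_with_gaps = [node.get('htmlUrl') for node in all_outlines]

# Only the <outline> elements that actually carry htmlUrl.
feed_outlines = root.findall('.//outline[@htmlUrl]')
urls = [node.get('htmlUrl') for node in feed_outlines]

# XML attribute names are case-sensitive, so a lowercase name matches nothing.
lowercase_matches = root.findall('.//outline[@htmlurl]')

print(urls_with_gaps)
print(urls)
print(lowercase_matches)
```

So the camelCase name itself is not the problem in this file. The failure comes from asking for the attribute on an element that does not have it.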