Determine root node
How can I determine the tag name of the root node? For CPACS I know it's /cpacs, but what about arbitrary XML documents?
Answer: Use the path "/". When combining/concatenating paths, however, this requires a distinction between "/" + childName vs. "/cpacs" + "/" + childName.
I created a convenience wrapper in Python that accepts the empty path instead and makes the above distinction by using either "/" or the empty string as the separator when concatenating.
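A minimal sketch of the string handling such a wrapper might do, assuming the empty string stands for the document root. The names `normalize` and `child_path` are hypothetical and not part of TIXI; no library call is made here.

```python
def normalize(path):
    """Map the wrapper's path to (query_path, separator).

    The empty path stands for the document root, which has to be queried
    as "/"; since that already ends in a slash, no separator is added.
    """
    if path == "":
        return "/", ""
    return path, "/"


def child_path(path, child_name):
    """Compose the path of a child element below `path`."""
    query_path, sep = normalize(path)
    return query_path + sep + child_name


print(child_path("", "cpacs"))                 # /cpacs
print(child_path("/cpacs", "toolspecific"))    # /cpacs/toolspecific
```

The root-level special case lives in one place, so callers can pass the empty path and recurse without checking for the root themselves.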
I did not completely understand what the issue is / was.
Could you please provide an example of what you tried to achieve, what did not work, and what works?
For the XML document
<?xml version="1.0" encoding="UTF-8"?>
<root>
<x><a>1.2</a><b>34</b></x>
</root>
How do I determine the root node name, /root? I think usually you simply need to know the name of the root node to be able to work with the file, but as described above it is possible to determine its name with TIXI (with one caveat explained above).
"but as described above it is possible to determine its name with TIXI (with one caveat explained above)."
But I did not get the caveat; this is what I meant. What is the workaround you are using? I am just trying to understand whether we could improve something in TIXI.
So the path I provide to the function is something like /cpacs/toolspecific, starting with a slash and ending with a tag name (or a predicate like [@uID="1"]). To get the root node, however, I cannot use the empty string; I have to use "/", which arguably has a trailing slash, in contrast to all other paths I might query.
That means I cannot recursively go through the returned nodes and compose them by parent + "/" + childName, because then I would get //cpacs/toolspecific. That's the reason why I had to make a distinction at the root level. I've come across this problem in many projects over the last years, and noticed it again today when creating a list of XPaths for all numeric values in an XML document (using TIXI).
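To make the problem concrete, here is a small illustration using plain strings only. The helper `compose_naive` is hypothetical and does not touch TIXI; it just applies the parent + "/" + childName rule repeatedly.

```python
def compose_naive(start, child_names):
    """Append each child name with a fixed "/" separator, with no root special case."""
    path = start
    for name in child_names:
        path = path + "/" + name
    return path


# Starting from the path that has to be used to query the root:
print(compose_naive("/", ["cpacs", "toolspecific"]))   # //cpacs/toolspecific

# Starting from the empty string gives the intended path, but the empty
# string cannot be used to query the root node itself:
print(compose_naive("", ["cpacs", "toolspecific"]))    # /cpacs/toolspecific
```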
import collections

def enumerateNumericXpaths(xml, path, paths):
    r''' Return a list of all XPaths that point to a number.
        xml: open Tixi document
        path: current path prefix (including index)
        paths: list of XPaths that contain a numeral
    >>> t, r = Tixi(), []
    >>> t.openString('<?xml version="1.0" encoding="UTF-8"?>\n<root><x><a>1.2</a><b>34</b></x></root>')
    >>> paths = enumerateNumericXpaths(t, '', r); print("\n".join(r))
    /root[1]/x[1]/a[1]
    /root[1]/x[1]/b[1]
    '''
    # Root-level distinction: the root must be queried as '/', which already ends in a slash
    if path == '': path, sep = '/', ''
    else: sep = '/'
    children = xml.getNumberOfChilds(path)
    counts = collections.defaultdict(int)  # map from element name -> current per-element index
    for i in range(1, children + 1):
        childName = xml.getChildNodeName(path, i)
        if childName == '#text':
            try:
                xml.getDoubleElement(path)  # xml.getIntegerElement(path) works as well for int and double
                paths.append(path)
            except Exception:  # text is not numeric
                pass
            continue
        elif childName.startswith('#'): continue  # comment or CDATA - ignored for now
        counts[childName] += 1
        enumerateNumericXpaths(xml, path + '%s%s[%d]' % (sep, childName, counts[childName]), paths)  # recurse into children
    return paths  # same list object that was passed in; appends are visible to the caller