pabigot/pyxb

XML without namespace not working. When/How to use default_namespace.


Hi!

I have an XSD that defines a namespace:

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified"
  targetNamespace="http://opennebula.org/XMLSchema" xmlns="http://opennebula.org/XMLSchema">
  <xs:element name="MARKETPLACE">
  ....
  </xs:element>
</xs:schema>

However, when making XML-RPC calls the namespace is omitted, and responses typically look like this:

<MARKETPLACE>
....
</MARKETPLACE>

Instead of

<MARKETPLACE xmlns='http://opennebula.org/XMLSchema'>
....
</MARKETPLACE>

I don't know how to get PyXB to work without somehow "patching" the namespace into the XML string before instantiating the binding.

Looking at the API, I wishfully tried:

marketplace = bindings.CreateFromDocument(xml, default_namespace="http://opennebula.org/XMLSchema")

I cannot find information about how to use default_namespace or what it is for.

Is there any way I can tell PyXB to assume that the XML is under a given namespace?

Is that a valid use case for the default_namespace parameter?

Assuming bindings is a module created by generating the bindings for the http://opennebula.org/XMLSchema namespace, then invoking bindings.CreateFromDocument() will automatically set the default namespace to http://opennebula.org/XMLSchema. I don't know why that wouldn't just work.

If bindings is for some other namespace, then you need to know that the value of the default_namespace parameter must be an instance of PyXB Namespace, not a string namespace URI. So if you import the bindings module for http://opennebula.org/XMLSchema as marketschema, you could override the default by doing:

marketplace = bindings.CreateFromDocument(xml, default_namespace=marketschema.Namespace)

I see.
I think the problem might be with the XSD files I am using.
You can have a look here:
https://github.com/OpenNebula/addon-pyone/blob/master/Makefile
I was surprised to see that there is an individual XSD file for each possible datatype, and I was missing a "root" XSD that includes all of them, saying in effect: a valid message is one of the following...
Maybe that is the reason PyXB cannot determine a default namespace automatically, even though all the XSD files share a common namespace.
What do you think?

I'm surprised that even works, because it was never intended that you be able to provide multiple schema files with one -u parameter. You should have one -m parameter for each -u parameter, as bindings are supposed to correspond to a single XSD file carrying the content for a single namespace, as with this example.

So yes, create an index.xsd for the target namespace that xs:includes all the component files, and generate your bindings from that. See if that works any better.

If not, attach a document that doesn't get recognized and show me how to reproduce the error.

Maybe the reason it works is that all the provided XSD files are in the same namespace.

I have created the index.xsd which looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified"
  targetNamespace="http://opennebula.org/XMLSchema" xmlns="http://opennebula.org/XMLSchema">
    <xs:include schemaLocation="acct.xsd"/>
    <xs:include schemaLocation="cluster_pool.xsd"/>
    <xs:include schemaLocation="cluster.xsd"/>
    <xs:include schemaLocation="datastore_pool.xsd"/>
    <xs:include schemaLocation="datastore.xsd"/>
    <xs:include schemaLocation="group_pool.xsd"/>
    <xs:include schemaLocation="group.xsd"/>
    <xs:include schemaLocation="host_pool.xsd"/>
    <xs:include schemaLocation="host.xsd"/>
    <xs:include schemaLocation="image_pool.xsd"/>
    <xs:include schemaLocation="image.xsd"/>
    <xs:include schemaLocation="marketplaceapp_pool.xsd"/>
    <xs:include schemaLocation="marketplaceapp.xsd"/>
    <xs:include schemaLocation="marketplace_pool.xsd"/>
    <xs:include schemaLocation="marketplace.xsd"/>
    <xs:include schemaLocation="user_pool.xsd"/>
    <xs:include schemaLocation="user.xsd"/>
    <xs:include schemaLocation="vdc_pool.xsd"/>
    <xs:include schemaLocation="vdc.xsd"/>
    <xs:include schemaLocation="vm_pool.xsd"/>
    <xs:include schemaLocation="vmtemplate_pool.xsd"/>
    <xs:include schemaLocation="vmtemplate.xsd"/>
    <xs:include schemaLocation="vm.xsd"/>
    <xs:include schemaLocation="vnet_pool.xsd"/>
    <xs:include schemaLocation="vnet.xsd"/>
    <xs:include schemaLocation="vrouter_pool.xsd"/>
    <xs:include schemaLocation="vrouter.xsd"/>
</xs:schema>
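An index file like this does not have to be maintained by hand. The following is only a rough sketch, assuming the component schemas sit together in one directory; the helper names `build_index_xsd` and `write_index` are made up for illustration and are not part of PyXB or pyone.

```python
from pathlib import Path
from xml.sax.saxutils import quoteattr

# Target namespace shared by all component schemas (taken from the XSD above).
TARGET_NS = "http://opennebula.org/XMLSchema"


def build_index_xsd(xsd_names, target_ns=TARGET_NS):
    """Return the text of an index schema that xs:includes each named file."""
    ns = quoteattr(target_ns)  # quoteattr adds the surrounding quotes
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified"',
        "  targetNamespace=%s xmlns=%s>" % (ns, ns),
    ]
    for name in sorted(xsd_names):
        if name == "index.xsd":
            continue  # never let the index include itself
        lines.append("    <xs:include schemaLocation=%s/>" % quoteattr(name))
    lines.append("</xs:schema>")
    return "\n".join(lines) + "\n"


def write_index(xsd_dir):
    """Hypothetical helper: write index.xsd next to the *.xsd files in xsd_dir."""
    xsd_dir = Path(xsd_dir)
    names = [p.name for p in xsd_dir.glob("*.xsd")]
    (xsd_dir / "index.xsd").write_text(build_index_xsd(names), encoding="utf-8")
```

Calling `write_index("pyone/xsd")` before running pyxbgen would regenerate the index whenever a component schema is added or removed.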

I can now generate the module doing this:

/usr/bin/python  /usr/bin/pyxbgen  -m pyone.bindings.__init__ -u pyone/xsd/index.xsd

which seems much nicer and more appropriate.

Unfortunately the problem persists: if the incoming XML does not include the namespace, instantiating the binding fails with this exception:

UnrecognizedDOMRootNodeError: <pyxb.utils.saxdom.Element object at 0x7f9c17422510>

I noticed that this does not work either:

marketpool = bindings.CreateFromDocument(xmlSample, default_namespace = bindings.Namespace)

Put xmlSample somewhere I can see it and reproduce the problem, or there's nothing I can do.

Sure. I have created a test that reproduces the issue in this branch:

https://github.com/rvalle/pyxb/blob/issue-0094/tests/trac/test-issue-0094.py

The issue occurs even without importing the multiple XSDs.

I'm afraid this is a situation where PyXB's focus on valid XML produces an unsatisfactory answer.

PyXB has never supported a default namespace that isn't explicitly identified in the XML. What it does is handle a special case where the XML uses material defined in a schema that has no namespace, and six years ago when I added support for that I erroneously used "default namespace" in the API for that special case.

Generally, schemas used with XML-RPC do not have a target namespace, and in that situation PyXB's special-case handling makes things work. But the XML-RPC documents you're working with are not valid standalone, and for whatever reason the server producing them isn't identifying the associated namespace.

PyXB relies on the parser to respect xmlns directives and provide the correct namespace name in the expanded names it produces, but the xml.sax parser doesn't provide a way to set an initial default namespace that isn't present in the XML. I'm not willing to override the expanded names it gives in a non-absent namespace situation because of the risk of accepting documents that should be diagnosed as invalid.
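To illustrate what "expanded names" means here, this is a minimal sketch using only the standard library's xml.sax in namespace mode. It is not PyXB's own code, and the names `RootNameHandler` and `root_expanded_name` are invented for the example. It reports the (namespace URI, local name) pair the parser assigns to the root element.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_namespaces


class RootNameHandler(ContentHandler):
    """Record the expanded name of the first (root) element seen."""

    def __init__(self):
        super().__init__()
        self.root_name = None

    def startElementNS(self, name, qname, attrs):
        # In namespace mode, name is a (namespace URI or None, local name) tuple.
        if self.root_name is None:
            self.root_name = name


def root_expanded_name(xml_text):
    """Parse xml_text and return the root element's expanded name."""
    parser = xml.sax.make_parser()
    parser.setFeature(feature_namespaces, True)  # deliver startElementNS events
    handler = RootNameHandler()
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(xml_text.encode("utf-8")))
    return handler.root_name


print(root_expanded_name("<MARKETPLACE/>"))
print(root_expanded_name('<MARKETPLACE xmlns="http://opennebula.org/XMLSchema"/>'))
```

Without the xmlns declaration the namespace part of the root's name is None, so it cannot match an element declared in the http://opennebula.org/XMLSchema namespace; with the declaration the parser supplies the URI itself.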

For more information see the comment in the commit that closes this issue.

Sorry. The workaround would be to take the response from the server and insert an xmlns="http://opennebula.org/XMLSchema" attribute in the root element.
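One possible way to apply that workaround, sketched with the standard library's ElementTree rather than anything in PyXB. The function name `add_default_namespace` is hypothetical, and the sketch assumes the response is a small, well-formed document whose elements are all meant to be in the one namespace; any comments or DOCTYPE outside the root element are dropped when the document is re-serialized.

```python
import xml.etree.ElementTree as ET

# Namespace the generated bindings expect (from the schema in this issue).
OPENNEBULA_NS = "http://opennebula.org/XMLSchema"


def add_default_namespace(xml_text, ns_uri=OPENNEBULA_NS):
    """Return xml_text with xmlns=ns_uri set on the root, if the root has no namespace."""
    root = ET.fromstring(xml_text)
    if root.tag.startswith("{"):
        # The root is already namespace-qualified, so leave the document alone.
        return xml_text
    # ElementTree writes this out as a plain attribute, which a namespace-aware
    # parser then reads back as a default namespace declaration.
    root.set("xmlns", ns_uri)
    return ET.tostring(root, encoding="unicode")


patched = add_default_namespace("<MARKETPLACE><ID>1</ID></MARKETPLACE>")
print(patched)
```

The patched string would then be what gets passed to bindings.CreateFromDocument() in place of the raw server response.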

ok, thanks!