Problem when parsing XML with namespaced elements deeper than level 2
Hi,
First of all, thank you for great software!
I am having a problem parsing an XML document that contains namespaced elements.
I have the following code and models (`test.py`):
import pathlib
import sys

from pydantic_xml import BaseXmlModel


class SomethingElse3(BaseXmlModel, tag='SomethingElse3'):
    SomethingElse4: str


class SomethingElse2(BaseXmlModel, tag='SomethingElse2'):
    SomethingElse3: SomethingElse3


class Something(
    BaseXmlModel,
    tag="SomethingElse",
    ns="something",
    nsmap={"something": "urn:something:something:v1"},
):
    SomethingElse2: SomethingElse2


def main(input_file):
    xml_doc = pathlib.Path(input_file).read_text()
    something = Something.from_xml(xml_doc)
    print(something.to_xml())


if __name__ == "__main__":
    main(sys.argv[1])
And I have 2 XML files.
input_ok.xml
<?xml version="1.0" encoding="UTF-8"?>
<something:SomethingElse
    xmlns:something="urn:something:something:v1"
>
    <something:SomethingElse2>
        <SomethingElse3>
            ABC
        </SomethingElse3>
    </something:SomethingElse2>
</something:SomethingElse>
and input_error.xml
<?xml version="1.0" encoding="UTF-8"?>
<something:SomethingElse
    xmlns:something="urn:something:something:v1"
>
    <something:SomethingElse2>
        <something:SomethingElse3>
            ABC
        </something:SomethingElse3>
    </something:SomethingElse2>
</something:SomethingElse>
The only difference between the two is that the SomethingElse3 element is namespaced in input_error.xml.
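The difference can be made visible with the standard library's xml.etree.ElementTree, which expands every tag to `{namespace-uri}localname` form. This is only a sketch, independent of pydantic-xml: the helper name `level3_tag` is made up for illustration, and the XML is inlined in compact form rather than read from the two files.

```python
import xml.etree.ElementTree as ET

# Compact version of input_ok.xml: SomethingElse3 has no prefix.
OK_XML = (
    '<something:SomethingElse xmlns:something="urn:something:something:v1">'
    "<something:SomethingElse2>"
    "<SomethingElse3>ABC</SomethingElse3>"
    "</something:SomethingElse2>"
    "</something:SomethingElse>"
)

# Compact version of input_error.xml: SomethingElse3 carries the prefix.
ERROR_XML = (
    '<something:SomethingElse xmlns:something="urn:something:something:v1">'
    "<something:SomethingElse2>"
    "<something:SomethingElse3>ABC</something:SomethingElse3>"
    "</something:SomethingElse2>"
    "</something:SomethingElse>"
)


def level3_tag(xml_text: str) -> str:
    """Return the fully qualified tag of the first level-3 element."""
    root = ET.fromstring(xml_text)
    return root[0][0].tag


print(level3_tag(OK_XML))     # SomethingElse3
print(level3_tag(ERROR_XML))  # {urn:something:something:v1}SomethingElse3
```

To an XML parser these are two different element names, so a model that expects the unqualified `SomethingElse3` would not be expected to match the qualified one.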
Executing `python test.py input_ok.xml` works fine and gives me this output:
b'<something:SomethingElse xmlns:something="urn:something:something:v1"><something:SomethingElse2><SomethingElse3>\n ABC\n </SomethingElse3></something:SomethingElse2></something:SomethingElse>'
But executing `python test.py input_error.xml` gives the error below:
Traceback (most recent call last):
  File ".../test.py", line 31, in <module>
    main(sys.argv[1])
  File ".../test.py", line 26, in main
    something = Something.from_xml(xml_doc)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../venv/lib/python3.11/site-packages/pydantic_xml/model.py", line 402, in from_xml
    return cls.from_xml_tree(etree.fromstring(source), context=context)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../venv/lib/python3.11/site-packages/pydantic_xml/model.py", line 379, in from_xml_tree
    ModelT, cls.__xml_serializer__.deserialize(
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../venv/lib/python3.11/site-packages/pydantic_xml/serializers/factories/model.py", line 201, in deserialize
    raise utils.build_validation_error(title=self._model.__name__, errors_map=field_errors)
pydantic_core._pydantic_core.ValidationError: 1 validation error for Something
SomethingElse2.SomethingElse3
  [line -1]: Field required [type=missing, input_value={}, input_type=dict]
So if any element from level 3 onwards has a namespace, the error arises.
I tried a few different ways of defining the models and model fields, with and without namespaces, but nothing helped.
Maybe someone can figure out whether something is wrong with how I define the models or whether this is a bug.
===
Also, the same behaviour shows up the other way around, i.e. when generating XML from a model.
test2.py
from pydantic_xml import BaseXmlModel


class SomethingElse3(BaseXmlModel, tag='SomethingElse3'):
    SomethingElse4: str


class SomethingElse2(BaseXmlModel, tag='SomethingElse2'):
    SomethingElse3: SomethingElse3


class Something(
    BaseXmlModel,
    tag="SomethingElse",
    ns="something",
    nsmap={"something": "urn:something:something:v1"},
):
    SomethingElse2: SomethingElse2


class Container(BaseXmlModel):
    something: Something


something = Something(
    SomethingElse2=SomethingElse2(
        SomethingElse3=SomethingElse3(
            SomethingElse4="ABC"
        )
    )
)
print(something.to_xml())
Executing `python test2.py` gives the following, where SomethingElse3 is also without a namespace:
b'<something:SomethingElse xmlns:something="urn:something:something:v1"><something:SomethingElse2><SomethingElse3>ABC</SomethingElse3></something:SomethingElse2></something:SomethingElse>'
@samholvi Hi,
Namespaces and namespace maps are not inherited by submodels. In your case SomethingElse3 doesn't have any namespace. The namespace must be declared explicitly for each model:
from pydantic_xml import BaseXmlModel

NSMAP = {
    "something": "urn:something:something:v1",
}


class SomethingElse3(BaseXmlModel, tag='SomethingElse3', ns="something"):
    SomethingElse4: str


class SomethingElse2(BaseXmlModel, tag='SomethingElse2', ns="something", nsmap=NSMAP):
    SomethingElse3: SomethingElse3


class Something(BaseXmlModel, tag="SomethingElse", ns="something", nsmap=NSMAP):
    SomethingElse2: SomethingElse2


something = Something(
    SomethingElse2=SomethingElse2(
        SomethingElse3=SomethingElse3(
            SomethingElse4="ABC"
        )
    )
)
print(something.to_xml(pretty_print=True).decode())
<something:SomethingElse xmlns:something="urn:something:something:v1">
  <something:SomethingElse2>
    <something:SomethingElse3>ABC</something:SomethingElse3>
  </something:SomethingElse2>
</something:SomethingElse>
Hi! Thank you, that works!
I definitely tried many combinations of parameters, but didn't think nsmap could be what was missing, as it works for second-level elements without specifying nsmap but not for third-level and deeper ones.
What may also have confused me is this part of the docs: "Xml default namespace is a namespace that is applied to the element and all its sub-elements without explicit definition." (from https://pydantic-xml.readthedocs.io/en/latest/pages/misc.html#default-namespace).
It made me think the namespace would be inherited.
Anyway, this is no longer an issue. Thank you.
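For anyone landing here with the same confusion: the docs sentence quoted above is about an XML default namespace (declared as `xmlns="..."`), which is not the same thing as a prefixed namespace (declared as `xmlns:something="..."`). The sketch below uses only the standard library's xml.etree.ElementTree, so it shows how plain XML treats the two cases rather than anything specific to pydantic-xml; the element names `Outer` and `Inner` and the helper `child_tag` are made up for illustration.

```python
import xml.etree.ElementTree as ET

URI = "urn:something:something:v1"

# Default namespace: unprefixed descendants fall into the namespace too.
DEFAULT_NS = f'<Outer xmlns="{URI}"><Inner>ABC</Inner></Outer>'

# Prefixed namespace: only elements that carry the prefix are in it.
PREFIXED_NS = (
    f'<something:Outer xmlns:something="{URI}">'
    "<Inner>ABC</Inner>"
    "</something:Outer>"
)


def child_tag(xml_text: str) -> str:
    """Return the fully qualified tag of the root's first child."""
    return ET.fromstring(xml_text)[0].tag


print(child_tag(DEFAULT_NS))   # {urn:something:something:v1}Inner
print(child_tag(PREFIXED_NS))  # Inner
```

In the XML itself, a prefix declared on a parent is merely available to its children; a child is only in that namespace if it uses the prefix. That seems consistent with the reply above, where `ns` has to be set on every model.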