Parsing XML to a class: A sequence of alternatives is wrongly parsed
DareDevilDenis opened this issue · 2 comments
DareDevilDenis commented
Using:
- xsdata 24.4
- Python 3.11.5
I ran xsdata generate my_schema.xsd
on the following schema:
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified" xmlns:vc="http://www.w3.org/2007/XMLSchema-versioning" vc:minVersion="1.1">
  <xs:element name="RootNode">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Field" minOccurs="2" maxOccurs="2">
          <xs:alternative test="@name='LeafType1'" type="LeafType1" />
          <xs:alternative test="@name='LeafType2'" type="LeafType2" />
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:complexType name="LeafType1">
    <xs:simpleContent>
      <xs:extension base="xs:unsignedByte">
        <xs:attribute name="name" fixed="LeafType1"/>
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>
  <xs:complexType name="LeafType2">
    <xs:simpleContent>
      <xs:extension base="xs:unsignedByte">
        <xs:attribute name="name" fixed="LeafType2"/>
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>
</xs:schema>
This created my_schema.py (this looks correct):
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass
class LeafType1:
    value: Optional[int] = field(
        default=None,
        metadata={
            "required": True,
        },
    )
    name: str = field(
        init=False,
        default="LeafType1",
        metadata={
            "type": "Attribute",
        },
    )


@dataclass
class LeafType2:
    value: Optional[int] = field(
        default=None,
        metadata={
            "required": True,
        },
    )
    name: str = field(
        init=False,
        default="LeafType2",
        metadata={
            "type": "Attribute",
        },
    )


@dataclass
class RootNode:
    field_value: List[Union[LeafType1, LeafType2]] = field(
        default_factory=list,
        metadata={
            "name": "Field",
            "type": "Element",
            "min_occurs": 2,
            "max_occurs": 2,
        },
    )
However, when I parsed the following input XML:
<RootNode>
<Field name="LeafType1">1</Field>
<Field name="LeafType2">2</Field>
</RootNode>
Using the following code:
import pprint
from pathlib import Path
from xsdata.formats.dataclass.parsers import XmlParser
from xsdata.formats.dataclass.parsers.handlers import XmlEventHandler
from my_schema import RootNode
input_xml_path = Path(__file__).parent / "input.xml"
parser = XmlParser(handler=XmlEventHandler)
deserialized = parser.parse(input_xml_path, RootNode)
pprint.pp(deserialized)
The output is wrong. Both Field elements are parsed as LeafType1, although the second has name="LeafType2" and should be a LeafType2:
RootNode(field_value=[LeafType1(value=1, name='LeafType1'),
                      LeafType1(value=2, name='LeafType1')])
Here are the files: xsdata_issue_1012.zip
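For reference, the selection the schema asks for can be sketched without xsdata at all. The snippet below is only an illustration of the expected result, not how xsdata implements alternatives: it uses the standard library's xml.etree.ElementTree, and the simplified LeafType1/LeafType2 stand-ins, the ALTERNATIVES mapping, and the expected_fields helper are all hypothetical names introduced here.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


# Simplified stand-ins for the generated classes (assumption: only the
# value and the fixed name matter for this illustration).
@dataclass
class LeafType1:
    value: int
    name: str = "LeafType1"


@dataclass
class LeafType2:
    value: int
    name: str = "LeafType2"


# Mirrors the two xs:alternative tests in the schema: @name selects the type.
ALTERNATIVES = {"LeafType1": LeafType1, "LeafType2": LeafType2}

INPUT_XML = """<RootNode>
<Field name="LeafType1">1</Field>
<Field name="LeafType2">2</Field>
</RootNode>"""


def expected_fields(xml_text):
    """Build one object per Field, choosing the class from its name attribute."""
    root = ET.fromstring(xml_text)
    return [
        ALTERNATIVES[element.get("name")](int(element.text))
        for element in root.findall("Field")
    ]


print(expected_fields(INPUT_XML))
```

Under these assumptions the second item comes out as a LeafType2, which is what I expected from the parser as well.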
tefra commented
Thanks for reporting, @DareDevilDenis. The fix is on main!
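from xsdata.formats.dataclass.parsers import XmlParser
from my_schema import RootNode

xml = """<RootNode>
<Field name="LeafType1">1</Field>
<Field name="LeafType2">2</Field>
</RootNode>"""
parser = XmlParser()
# Pass the target class explicitly so the snippet does not rely on class lookup.
result = parser.from_string(xml, RootNode)
print(result)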
RootNode(field_value=[LeafType1(value=1, name='LeafType1'), LeafType2(value=2, name='LeafType2')])
DareDevilDenis commented
Thanks @tefra for the very quick fix! 👍