How to parse a tag which can have multiple names in a single property
sdipendra opened this issue · 4 comments
How can I parse a tag that can appear under multiple names?
For example, for a tag named "link:Stat":
some of my XML documents use the fully qualified name: <link:Stat></link:Stat>
others use just <Stat></Stat>, without the namespace.
No document uses both formats.
I want both forms to map to the same single property: val stat: Stat
How can I achieve this? Thanks!
There are three approaches. The first is to add a custom handler for unknown content in the policy; this is currently broken (I've fixed it). The second is to put a filter on the parser that just remaps tags. The third, which fits your case, is to override the mechanism by which the policy maps Kotlin types to tag names. This is global, but it lets you use the same serializer with a different policy to parse either format.
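To make the filter idea concrete, here is a minimal sketch of a tag-remapping reader. It deliberately uses the JDK's StAX `StreamReaderDelegate` rather than xmlutil's own reader types, so it only illustrates the technique; wiring an equivalent filter into xmlutil is not shown here. The names `StatRemappingReader`, `LINK_NS` and `elementNames` are made up for this example.

```kotlin
import java.io.StringReader
import javax.xml.namespace.QName
import javax.xml.stream.XMLInputFactory
import javax.xml.stream.XMLStreamConstants
import javax.xml.stream.XMLStreamReader
import javax.xml.stream.util.StreamReaderDelegate

// Hypothetical constant for this sketch; it mirrors the namespace used in this issue.
const val LINK_NS = "http://www.kodepad.com/xml/link"

/** Reports every un-namespaced <Stat> element as if it had been written <link:Stat>. */
class StatRemappingReader(delegate: XMLStreamReader) : StreamReaderDelegate(delegate) {

    // True only on a start/end tag named "Stat" that carries no namespace.
    private fun isBareStat(): Boolean {
        val event = getEventType()
        val onElement = event == XMLStreamConstants.START_ELEMENT ||
            event == XMLStreamConstants.END_ELEMENT
        return onElement &&
            super.getLocalName() == "Stat" &&
            super.getNamespaceURI().isNullOrEmpty()
    }

    override fun getNamespaceURI(): String? =
        if (isBareStat()) LINK_NS else super.getNamespaceURI()

    override fun getPrefix(): String? =
        if (isBareStat()) "link" else super.getPrefix()

    override fun getName(): QName =
        if (isBareStat()) QName(LINK_NS, "Stat", "link") else super.getName()
}

/** Collects the names of all start tags as seen through the remapping filter. */
fun elementNames(xml: String): List<QName> {
    val raw = XMLInputFactory.newInstance().createXMLStreamReader(StringReader(xml))
    val reader = StatRemappingReader(raw)
    val names = mutableListOf<QName>()
    while (reader.hasNext()) {
        if (reader.next() == XMLStreamConstants.START_ELEMENT) names += reader.getName()
    }
    return names
}

fun main() {
    val withPrefix = "<device xmlns:link=\"$LINK_NS\"><link:Stat>WORKING</link:Stat></device>"
    val withoutPrefix = "<device xmlns:link=\"$LINK_NS\"><Stat>WORKING</Stat></device>"
    // Both documents should yield the same element names after filtering.
    println(elementNames(withPrefix) == elementNames(withoutPrefix))
}
```

Because the remapping happens before any name matching, the consumer never sees the un-namespaced form. The trade-off is that the filter is blind to context: every bare `Stat` is rewritten, wherever it appears.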
For the third approach, I'm trying to override the policy behaviour, but I can't identify which method to override.
I've created a failing test setup for this. It would be great if you could point me to the policy method that I should override.
In the current setup, the first test case (with prefix) passes and the second test case (without prefix) fails.
package com.kodepad.xml

import kotlinx.serialization.Serializable
import kotlinx.serialization.decodeFromString
import nl.adaptivity.xmlutil.ExperimentalXmlUtilApi
import nl.adaptivity.xmlutil.serialization.DefaultXmlSerializationPolicy
import nl.adaptivity.xmlutil.serialization.XML
import nl.adaptivity.xmlutil.serialization.XmlElement
import nl.adaptivity.xmlutil.serialization.XmlSerialName
import nl.adaptivity.xmlutil.serialization.XmlSerializationPolicy
import nl.adaptivity.xmlutil.serialization.XmlValue
import org.junit.jupiter.api.Test
import org.slf4j.LoggerFactory
import kotlin.test.assertEquals

@OptIn(ExperimentalXmlUtilApi::class)
internal class XMLUtilFailingTest {
    @Serializable
    @XmlSerialName(
        namespace = "http://www.kodepad.com/xml/equipment",
        prefix = "equipment",
        value = "device",
    )
    data class Device(
        @XmlElement(value = true) val stat: Stat?,
    )

    @Serializable
    @XmlSerialName(
        namespace = "http://www.kodepad.com/xml/link",
        prefix = "link",
        value = "Stat",
    )
    data class Stat(
        @XmlValue val value: String,
    )

    class XmlSerializationPolicyProxy(xmlSerializationPolicy: XmlSerializationPolicy) :
        XmlSerializationPolicy by xmlSerializationPolicy {
        // todo: Override method to map "Stat" to "link:Stat"
    }

    companion object {
        private val log = LoggerFactory.getLogger(this::class.java.declaringClass.name)
        private val expectedValue = Device(Stat("WORKING"))
    }

    private val xml = XML {
        this.policy = XmlSerializationPolicyProxy(
            DefaultXmlSerializationPolicy(
                false, encodeDefault = XmlSerializationPolicy.XmlEncodeDefault.NEVER
            )
        )
    }

    @Test
    fun `parse xml with prefix`() {
        val xmlString =
            "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n" +
                "<equipment:device xmlns:equipment=\"http://www.kodepad.com/xml/equipment\"\n" +
                " xmlns:link=\"http://www.kodepad.com/xml/link\">\n" +
                " <link:Stat>WORKING</link:Stat>\n" +
                "</equipment:device>\n"
        val device = xml.decodeFromString<Device>(xmlString)
        log.info("device: $device")
        assertEquals(expectedValue, device)
    }

    @Test
    fun `parse xml without prefix`() {
        val xmlString =
            "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n" +
                "<equipment:device xmlns:equipment=\"http://www.kodepad.com/xml/equipment\"\n" +
                " xmlns:link=\"http://www.kodepad.com/xml/link\">\n" +
                " <Stat>WORKING</Stat>\n" +
                "</equipment:device>\n"
        val device = xml.decodeFromString<Device>(xmlString)
        log.info("device: $device")
        assertEquals(expectedValue, device)
    }
}
Included dependencies:
plugins {
    kotlin("jvm") version "1.8.20"
    kotlin("plugin.serialization") version "1.8.20"
}

dependencies {
    // Serialization
    implementation("org.jetbrains.kotlinx:kotlinx-serialization-json:1.5.0")
    implementation("io.github.pdvrieze.xmlutil:core:0.86.0")
    implementation("io.github.pdvrieze.xmlutil:serialization:0.86.0")
}
Unfortunately there is a bug in the handling (now fixed in dev). What should be overridden is handleUnknownContentRecovering. To see how this works look at:
and:
But please note that this is broken in master: the helper function is new, and more significantly, recovery for elements is broken (it fails to read the end tag).
Checked on dev. This works for my use case. Thank you.
One suggestion, though: instead of having a specific method for handling the null namespace, wouldn't it be better to have a method that maps a parsed QName to some other QName? That would cover the null-namespace case and many other use cases as well.
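To illustrate the suggestion, a hook of this kind might look roughly like the sketch below. Everything in it is hypothetical: `QNameMapper`, `nullNamespaceMapper` and `LINK_NAMESPACE` are names invented for this example and are not part of xmlutil. It uses only the JDK's `javax.xml.namespace.QName`.

```kotlin
import javax.xml.namespace.QName

// Hypothetical constant for this sketch; it mirrors the namespace used in this issue.
const val LINK_NAMESPACE = "http://www.kodepad.com/xml/link"

/** Hypothetical hook (not an xmlutil API): maps a parsed name to the name used for matching. */
fun interface QNameMapper {
    fun map(parsed: QName): QName
}

/**
 * Builds a mapper that moves un-namespaced names listed in [localNames]
 * into [namespace] and leaves every other name unchanged.
 */
fun nullNamespaceMapper(namespace: String, localNames: Set<String>): QNameMapper =
    QNameMapper { parsed ->
        if (parsed.namespaceURI.isEmpty() && parsed.localPart in localNames) {
            QName(namespace, parsed.localPart)
        } else {
            parsed
        }
    }

fun main() {
    val mapper = nullNamespaceMapper(LINK_NAMESPACE, setOf("Stat"))
    // A bare "Stat" and a namespaced "Stat" should resolve to the same name.
    println(mapper.map(QName("Stat")) == mapper.map(QName(LINK_NAMESPACE, "Stat")))
}
```

The null-namespace case then becomes one mapper among many. The same hook could also rename tags or fold several legacy namespaces into one.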