lucidsoftware/xtract

How to parse XML containing a recursive sequence of subnodes of different types (with different attributes)?


I saw issue #26, but the nodes there had no attributes.
I have something similar with a recursive structure:

<document>
 <container>
   <paramset>
      <paramset>
      <param>
   <container>
...

Each node type has different attributes. A container can contain multiple containers or paramsets,
and a paramset can contain multiple paramsets or params.
I'm having trouble parsing that with xtract. Any tips?

How do you handle the case where you have a sequence of two or more different types of subnodes (with attributes)?

Never mind - I switched to a different XML lib.

I think you could do this with something like:

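import cats.syntax.all._
import com.lucidchart.open.xtract.{XmlReader, __}

case class Container(paramSets: Seq[ParamSet], children: Seq[Container])
object Container {
  implicit val reader: XmlReader[Container] = (
    (__ \ "paramset").read(XmlReader.seq[ParamSet]),
    // lazyRead defers the recursive reference to this reader
    (__ \ "container").lazyRead(XmlReader.seq[Container])
  ).mapN(apply _)
}

case class ParamSet(parms: Seq[Param], children: Seq[ParamSet])
object ParamSet {
  // Assumes an implicit XmlReader[Param] is defined elsewhere.
  implicit val reader: XmlReader[ParamSet] = (
    // Readers are listed in the same order as the case class fields.
    (__ \ "param").read(XmlReader.seq[Param]),
    (__ \ "paramset").lazyRead(XmlReader.seq[ParamSet])
  ).mapN(apply _)
}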
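
Since the question is specifically about attributes, here is a rough sketch of how the same readers might pick them up with XmlReader.attribute. The attribute names ("id", "name", "value") and the extra case class fields are made up for illustration, since the original XML doesn't show them, and I haven't run this against the real documents:

```scala
import cats.syntax.all._
import com.lucidchart.open.xtract.{XmlReader, __}
import com.lucidchart.open.xtract.XmlReader.{attribute, seq}

// Hypothetical attributes: "name" and "value" on <param>.
case class Param(name: String, value: String)
object Param {
  implicit val reader: XmlReader[Param] = (
    attribute[String]("name"),
    attribute[String]("value")
  ).mapN(apply _)
}

// Hypothetical attribute: "name" on <paramset>.
case class ParamSet(name: String, params: Seq[Param], children: Seq[ParamSet])
object ParamSet {
  implicit val reader: XmlReader[ParamSet] = (
    attribute[String]("name"),
    (__ \ "param").read(seq[Param]),
    // lazyRead defers the recursive reference to this reader
    (__ \ "paramset").lazyRead(seq[ParamSet])
  ).mapN(apply _)
}

// Hypothetical attribute: "id" on <container>.
case class Container(id: String, paramSets: Seq[ParamSet], children: Seq[Container])
object Container {
  implicit val reader: XmlReader[Container] = (
    attribute[String]("id"),
    (__ \ "paramset").read(seq[ParamSet]),
    (__ \ "container").lazyRead(seq[Container])
  ).mapN(apply _)
}

object AttributeExample {
  def main(args: Array[String]): Unit = {
    val xml =
      <container id="root">
        <paramset name="outer">
          <paramset name="inner"><param name="p" value="1"/></paramset>
        </paramset>
        <container id="child"/>
      </container>

    // read returns a ParseResult; toOption is None if parsing failed
    println(XmlReader.of[Container].read(xml).toOption)
  }
}
```

Each node type gets its own reader, so each one can read whatever attributes it needs alongside its child sequences.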

Note that these readers assume the order doesn't matter when params and paramsets are interspersed with each other (and likewise for paramsets and containers). If you do need to preserve the order, it is more complicated.
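
For the order-preserving case, one possible approach is to walk the child elements in document order and keep them in a single list. The sketch below deliberately uses plain scala.xml rather than xtract, because I'm not sure what the idiomatic xtract way would be. The Child, Param, and ParamSet types and the "name" attribute are hypothetical, and it assumes Scala 2 with scala-xml on the classpath:

```scala
import scala.xml.Elem

object OrderedParse {
  // Hypothetical model: one ordered list of mixed children per paramset.
  sealed trait Child
  final case class Param(name: String) extends Child
  final case class ParamSet(name: String, children: List[Child]) extends Child

  def parseParamSet(e: Elem): ParamSet =
    ParamSet(
      e \@ "name", // attribute value, or "" if the attribute is missing
      // e.child is in document order; text nodes and unknown elements are skipped
      e.child.toList.collect {
        case c: Elem if c.label == "paramset" => parseParamSet(c)
        case c: Elem if c.label == "param"    => Param(c \@ "name")
      }
    )

  def main(args: Array[String]): Unit = {
    val xml =
      <paramset name="outer"><param name="a"/><paramset name="inner"/><param name="b"/></paramset>

    val labels = parseParamSet(xml).children.map {
      case Param(n)       => s"param:$n"
      case ParamSet(n, _) => s"paramset:$n"
    }
    println(labels) // List(param:a, paramset:inner, param:b)
  }
}
```

The same pattern would extend to containers by adding a Container case to the sealed trait and a matching branch in the collect.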