lucidsoftware/xtract

The parser doesn't completely parse the XML tree

Closed this issue · 2 comments

Vesli commented

Hello,

I am trying out your library and am having difficulty parsing XML.

build.sbt:

scalaVersion := "2.12.8"

libraryDependencies ++= Seq(
  "com.lucidchart" %% "xtract" % "2.0.0",
)

I have this XML, generated from a sports API:

<Export>
    <GeneralInformation>
        <Date>2019-05-17</Date>
        <Time>14:21:30</Time>
        <Timestamp>1558102890303</Timestamp>
    </GeneralInformation>
    <Sport code="SOC" id="1" name="SOCCER" hid="13383104" lineup_hid="785967" commentary_hid="1335638">
        <Matchday date="2019-04-14">
            <Match ct="-2" id="1597322" lastPeriod="2 HF" leagueCode="44337" leagueSort="11" leagueType="LEAGUE" startTime="10:00" status="Fin" statustype="fin" type="2" visible="1" lineups="1">
                ...
            </Match>
            <Match ct="-6" id="1599008" lastPeriod="2 HF" leagueCode="44415" leagueSort="11" leagueType="LEAGUE" startTime="10:30" status="Fin" statustype="fin" type="2" visible="1" lineups="1">
                ...
            </Match>
        </Matchday>
    </Sport>
    <Sport code="BSK" id="3" name="BASKETBALL" hid="10716464" lineup_hid="101838" commentary_hid="0">
        <Matchday date="2019-04-14">
            <Match ct="0" id="5181854" lastPeriod="4Qrt" leagueCode="190925" leagueSort="1" leagueType="LEAGUE" startTime="00:00" status="Fin" statustype="fin" type="2" visible="1" lineups="1">
                ...
            </Match>
            <Match ct="0" id="5181855" lastPeriod="4Qrt" leagueCode="190925" leagueSort="1" leagueType="LEAGUE" startTime="02:30" status="Fin" statustype="fin" type="2" visible="1" lineups="1">
                ...
            </Match>
        </Matchday>
    </Sport>
</Export>

Following the example given, I defined these case classes and readers:

import com.lucidchart.open.xtract.XmlReader._
import com.lucidchart.open.xtract.{XmlReader, __}
import cats.syntax.all._


case class Sport(
                  code: String,
                  id: String,
                  name: String,
                  hid: String,
                  lineup_hid: String,
                  Matchday: Seq[Matchday]
                )
object Sport {
  implicit val reader: XmlReader[Sport] = (
    attribute[String]("code"),
      attribute[String]("id"),
      attribute[String]("name"),
      attribute[String]("hid"),
      attribute[String]("lineup_hid"),
      (__ \ "Matchday").read(seq[Matchday])
  ).mapN(apply _)
}

case class Matchday (
                      date: String,
                      Matchs: Seq[Match]
                    )
object Matchday {
  implicit val reader: XmlReader[Matchday] = (
    attribute[String]("date"),
      (__ \ "Match").read(seq[Match])
  ).mapN(apply _)
}

case class Match (
                   ct: String,
                   id: String,
                   lastPeriod: String,
                   leagueCode: String,
                   leagueSort: String,
                   leagueType: String,
                   startTime: String,
                   status: String,
                   statusType: String,
                   Type: String,
                   visible: String,
                   lineups: String
                 )
object Match {
  implicit val reader: XmlReader[Match] = (
    attribute[String]("ct"),
      attribute[String]("id"),
      attribute[String]("lastPeriod"),
      attribute[String]("leagueCode"),
      attribute[String]("leagueSort"),
      attribute[String]("leagueType"),
      attribute[String]("startTime"),
      attribute[String]("status"),
      attribute[String]("statusType"),
      attribute[String]("type"),
      attribute[String]("visible"),
      attribute[String]("lineups")
  ).mapN(apply _)
}

case class Export (
                    Sport: Seq[Sport]
                  )
object Export {
  implicit val reader: XmlReader[Export] = (__ \ "Sport").read(seq[Sport]).default(Nil).map(apply _)
}

The XmlHelper is the same as in the GitHub example:

import java.io.File
import com.lucidchart.open.xtract.XmlReader
import scala.io.Source
import scala.xml.XML
/**
  * This trait provides functionality to parse XML data into Scala case classes
  */
trait XmlHelper {
  def xtract(filePath: String): Option[Export] = {
    val xmlData = Source.fromFile(new File(filePath)).getLines().mkString("\n")
    println("***File to be parsed: ")
    println(xmlData)
    val xml = XML.loadString(xmlData)
    XmlReader.of[Export].read(xml).toOption
  }
}

And my main app:

object MainApp extends App with XmlHelper {
    println("One")
    val path = "src/main/resources/scorespro.xml"
    println("Two")
    val response = xtract(path)
    println("***RESPONSE: " + response)
}

But when I run my code, I get this:

***RESPONSE: Some(Export(Vector(Sport(SOC,1,SOCCER,13383104,785967,Vector(Matchday(2019-04-14,Vector()))), Sport(BSK,3,BASKETBALL,10716464,101838,Vector(Matchday(2019-04-14,Vector()))))))

Whatever I try - changing the Match attributes or using lazyRead - the Match tags are never parsed; every Matchday comes back with an empty Vector.

Any chance I am doing something completely wrong here?

It looks like the Match element in the XML has an attribute called "statustype", but your code is looking for the attribute "statusType". Attribute names are case-sensitive, so every Match element fails to parse into Match, and because seq is somewhat lenient, the failed elements are dropped instead of producing an error. If you had used strictReadSeq instead of seq, it probably would have given you an error.
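
As a rough sketch of both changes together, the readers might look like the following. Match is trimmed to two attributes for brevity, and StrictSeqSketch / parseMatchday are hypothetical names used only for illustration; the build is assumed to be the same as in the question (xtract 2.0.0 plus scala-xml and cats).

```scala
import com.lucidchart.open.xtract.XmlReader._
import com.lucidchart.open.xtract.{ParseResult, XmlReader, __}
import cats.syntax.all._
import scala.xml.XML

// Trimmed-down Match: only two of the original twelve attributes, for brevity.
case class Match(id: String, statusType: String)
object Match {
  implicit val reader: XmlReader[Match] = (
    attribute[String]("id"),
    // Attribute names are case-sensitive: the XML uses "statustype".
    attribute[String]("statustype")
  ).mapN(apply _)
}

case class Matchday(date: String, matches: Seq[Match])
object Matchday {
  implicit val reader: XmlReader[Matchday] = (
    attribute[String]("date"),
    // strictReadSeq instead of seq, so that a Match element that fails
    // to parse is not silently dropped from the result.
    (__ \ "Match").read(strictReadSeq[Match])
  ).mapN(apply _)
}

// Hypothetical helper for illustration; not part of xtract.
object StrictSeqSketch {
  def parseMatchday(xmlText: String): ParseResult[Matchday] =
    XmlReader.of[Matchday].read(XML.loadString(xmlText))
}
```

Returning the ParseResult itself, rather than calling .toOption right away as XmlHelper does, keeps whatever failure information xtract reports available to the caller.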

I'm going to close this, since I have answered the question. If you still have an issue, please reopen.