/scala-xml-codec

An XML Codec Library for Scala

Primary LanguageScalaApache License 2.0Apache-2.0

Scala XML Codec

XML validation and binding library.

Features

  • Validate XML structure
  • Assertions on nodes and node collections
  • Decode XML
    • As nested HLists (default)
    • As arbitrary data types by using custom decoders
    • Using effects (decoders can target an arbitrary monad)
  • Encode XML
  • Modular schemas

Limitations

  • No support for namespaces
  • No support for validating the order of nodes
  • No support for individual text nodes (only Elem.text is supported)

Examples

Usage

Schema builder API

Schema builder API

Cardinalities

Supported cardinalities for node collections:

…              → A
optional(…)    → Option[A]
oneOrMore(…)   → NonEmptyList[A]
zeroOrMore(…)  → List[A]

Elements

elem1("name", child)
elem2("name", child1, child2)
…

Attributes, text and child elements are declared as children:

elem3("name",
  attr(…),
  elem1("child", …),
  text
)

Attributes

attribute("name")

Mandatory/optional attributes:

attr("name")            → String
optional(attr("name"))  → Option[String]

Text

There are two flavours of handling text:

By combining nonEmptyText with the one and optional cardinalities. In this case it is assured that no empty text values are emitted:

nonEmptyText            → String
optional(nonEmptyText)  → Option[String]

By using text directly. In this case an empty text value are emitted if the parent element doesn't contain any text:

text                    → String

The schema nonEmptyText is equivalent to text.ensure(nonEmpty).

Assertions

Assertions can be made using ensure:

attr("name").ensure(nonEmpty)
attr("name").ensure(mustEqual("Sam"))

Assertions can be applied to collections:

  oneOrMore(elem1("employee",
    zeroOrMore(
      elem1("tool", text)
    ).ensure(check(_.size < 3, _ => "Only 2 tools allowed"))
  ))

Custom assertions

Custom assertions can be implemented in this fashion. It is possible to use an effect F; see section on codecs for details.

def nonEmpty[F[_]:Applicative]: Ensure[F, String] =
  _.isEmpty.option("String must not be empty").point[F]

Modes (encode, decode, codec)

When using the schema builder DSL, you have to decide what you want to use your schema for: encoding, decoding, or both. This is done by importing the corresponding variant of the DSL:

import Dsl.simple.decode._
import Dsl.simple.encode._
import Dsl.simple.codec._

For decode/encode schemas, only decoders/encoders for the involved types need to be provided. For codec schemas, both decoders and encoders are required.

This approach allows the compiler to show an error when a decoder/encoder is missing in a specific location. The alternative would be using mode-agnostic schemas and assembling the decoder/encoder implicitly when it is required, but this would lead to incomprehensible implicit-not-found error messages.

Decoding XML

In this example we use the simple schema which targets the cats.Id monad:

// Get access to the schema builder API
import Dsl.simple.codec._

final case class Foo(a: String, b: Option[String], bars: List[Bar])
final case class Bar(c: String, d: Option[String])

val schema =
  elem3("foo",
    attr("a"),
    optional(attr("b")),
    zeroOrMore(elem2("bar",
      attr("c"),
      optional(attr("d"))
    ))
  )

val result: NonEmptyList[String] \/ Foo =
  schema.decode(<foo>…</foo>)
    .map {
      case a :: b :: barElems :: HNil =>
        val bars = barElems map {
          case c :: d :: HNil => Bar(c, d)
        }
        Foo(a, b, bars)
    }

Composing schemas

Schemas can be composed by including other schemas. A codec schema can be included in decoder and encoder schemas, but not the other way around.

import Dsl.simple.codec._

final case class Foo(a: String, b: Option[String], bars: List[Bar])
final case class Bar(c: String, d: Option[String])

val barElem =
  elem2("bar",
    attr("c"),
    optional(attr("d"))
  ).as[Bar]

val fooElem =
  elem3("foo",
    attr("a"),
    optional(attr("b")),
    zeroOrMore(barElem) // Include other schema
  ).as[Foo]

val result: NonEmptyList[String] \/ Foo = fooElem.decode(<foo></foo>)

Codecs

Text nodes, attributes, elements and collections thereof can be decoded/encoded to/from custom types. Any error messages generated during decoding are prepended with the XML path, which makes it easier to locate the source of the error.

Usage

Pass a codec implicitly:

attr("name").as[A]

Pass a codec explicitly:

attr("name") ~ codecForA

Codecs can be chained:

attr("foo")
  .as[LocalDate]                  // decode to LocalDate
  .as[LocalDate @@ StartDate]     // apply tag

Codecs for generic representations

If the target type of a schema is a HList that is the generic representation of a certain type (e.g. case class) A, a codec for A is provided out of the box, courtesy of shapeless:

final case class Bar(c: String, d: Option[String])

val barElem = elem2("bar", attr("c"), optional(attr("d"))).as[Bar]

Implementing custom codecs

A codec targets an effect monad F, which makes it usable only in schemas supporting this effect type.

Implementing a codec for arbitrary effects

implicit def myCodec[F[_]]: Codec[F, String, Foo] =
  Codec.from(
    Decoder.fromDisjunction(x => if (canDecode(x)) \/-(…) else -\/(NonEmptyList("An error occurred"))),
    Encoder.fromFunction(_.unwrap)
  )

Implementing a codec targeting a specific effect

Example for a decoder targeting a cats.data.Reader:

type EnvReader[A] = Reader[Env, A]

def findVideoFormat(env: Env, name: String): Option[VideoFormat] = ???
def getVideoFormatName(env: Env, videoFormat: VideoFormat): String = ???

implicit def videoFormatCodec: Codec[EnvReader, String, VideoFormat] =
  Codec.from(
    Decoder.fromEitherT(name => EitherT(Reader { env =>
      val videoFormat: Option[VideoFormat] = findVideoFormat(env, name)
      videoFormat.\/>(s"Video format $name not found")
    })),
    Encoder.fromEffect(f => Reader(env => getVideoFormatName(env, f)))
  )

Usage:

attr("name").as[VideoFormat]

Decoding XML targeting the EnvReader effect monad:

// Create a schema
val mySchema = new ch.srg.xml.Schema[EnvReader]

// Get access to the schema builder API
import mySchema.codec._

val schema = …

schema
  .decode(xml)
  .run(env) // Execute the effect