ecorm/cppwamp

Provide DTO-to-Variant conversion facilities

ecorm opened this issue · 3 comments

Provide facilities for converting user DTOs (Data Transfer Objects) to/from Variant objects.

For example, this C++ struct:

struct MyDto
{
    int id;
    std::string name;
    std::vector<float> values;
};

would be converted to the following Variant object (shown in JSON representation):

{
    "id": 42,
    "name": "Slartibartfast",
    "values": [12.34, 56.78]
}

There are two possible approaches:

  1. Manual approach
  2. IDL approach

The "manual" approach makes the user responsible for defining how her objects are serialized. It typically involves providing a serialize function that takes an "archive", and then inserting/extracting each object member to/from the archive. This is the approach used by the Boost.Serialization and Cereal libraries.
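To make the manual approach concrete, here is a minimal sketch in the style of Boost.Serialization and Cereal. ToyOutputArchive, describe, and example are hypothetical names invented for illustration; they are not part of CppWAMP or Cereal. The archive only renders name=value text so that the sketch stays self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

struct MyDto
{
    int id;
    std::string name;
    std::vector<float> values;
};

// Hypothetical archive: collects "name=value" pairs as plain text.
class ToyOutputArchive
{
public:
    void operator()(const char* key, int v) { put(key, std::to_string(v)); }

    void operator()(const char* key, const std::string& v)
    {
        put(key, '"' + v + '"');
    }

    template <typename T>
    void operator()(const char* key, const std::vector<T>& v)
    {
        std::ostringstream oss;
        oss << '[';
        for (std::size_t i = 0; i < v.size(); ++i)
            oss << (i ? "," : "") << v[i];
        oss << ']';
        put(key, oss.str());
    }

    const std::string& str() const { return out_; }

private:
    void put(const char* key, const std::string& text)
    {
        if (!out_.empty())
            out_ += ", ";
        out_ += key;
        out_ += '=';
        out_ += text;
    }

    std::string out_;
};

// The user-supplied serialize function: it names each member and hands
// it to whatever archive it is given.
template <typename Archive>
void serialize(Archive& ar, MyDto& dto)
{
    ar("id", dto.id);
    ar("name", dto.name);
    ar("values", dto.values);
}

// Convenience wrapper used only by this sketch.
std::string describe(MyDto dto)
{
    ToyOutputArchive ar;
    serialize(ar, dto);
    return ar.str();
}

void example()
{
    MyDto dto{42, "Slartibartfast", {1.5f, 2.5f}};
    assert(describe(dto) == "id=42, name=\"Slartibartfast\", values=[1.5,2.5]");
}
```

The user only writes serialize; everything else belongs to the archive, which is where a real Variant-producing implementation would live.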

The IDL approach involves defining the schema of an object in an Interface Description Language. A special compiler looks at the IDL definition and generates a struct (or class) that is used to store the data. This generated struct "knows" how to serialize/deserialize itself to/from the archive. This is the approach used by Protocol Buffers.

The advantages of the manual approach are:

  • Does not require the user to compile an IDL compiler, or to obtain a pre-built copy.
  • The user can convert directly from her DTO to Variant, without involving a generated struct.
  • The user has more control over how their objects are converted. They can handle special cases, such as null values.
  • Implementing support for manual conversion is probably much easier than implementing an IDL compiler.

The advantages of the IDL approach are:

  • The IDL schema serves as a way of documenting DTOs.
  • The IDL could potentially be used in other languages.

Unless someone convinces me otherwise, I'm going with the "manual" approach for this feature. I'm either going to use the Cereal library (and implement a custom archive), or will implement something from scratch along the same lines as Cereal.

I have experimented with Protocol Buffers before, and I don't like the IDL approach for the following reasons:

  • The application build process is more complicated, as you now have to run the IDL compiler as a pre-build step.
  • Reliance on an IDL compiler script or executable.
  • Writing an IDL schema takes as much effort as writing a serialize function for your serializable object.
  • You're forced to work with the intermediate struct that protobuf provides. You either have to copy values from your domain object to the protobuf struct, or embed the protobuf struct within your domain object. I find that a serialize function is less intrusive than the protobuf struct.

Note that this decision doesn't prevent anyone else from writing an IDL compiler that converts to wamp::Variant, if that's how they prefer to work. An IDL compiler could even make use of the manual conversion infrastructure.

I've taken a closer look at Cereal, and I won't be using it for the following reasons:

  1. It adds yet another dependency to CppWAMP.
  2. It has to contend with various kinds of archives, so it ends up being more heavyweight than what I need it to be.
  3. I would mostly be using it as a "front end" that converts standard library and Boost containers into fundamental types.
  4. The metadata it adds to archives is not well documented, so it's problematic for data interchange when the other peer is not written in C++.
  5. It does not serialize std::map<std::string, T> into normal JSON objects. It instead serializes it into an array of key-value pairs.
  6. It jumbles both positional and keyword values in a way that doesn't harmonize well with WAMP.

That being said, Cereal seems like an excellent library for saving object graphs to files. I especially like its API, and will take inspiration from it when designing my own conversion utilities.

I might have dismissed Cereal too quickly. It turns out that Cereal can do a lot of work for me:

  • Detecting internal vs external serialization functions for user types.
  • Detecting load/save split serialization functions for user types.
  • Handling serialization of base classes.
  • There are 22 different headers in cereal/include/cereal/types/ for handling standard library and Boost types.
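For a sense of what the first bullet involves, here is a rough C++11 sketch of detecting an internal (member) versus external (free) serialize function. It is not Cereal's implementation; HasMemberSerialize, process, dispatch, and NameListArchive are hypothetical names made up for this illustration. Detecting split load/save functions would presumably follow the same pattern with more traits.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// C++11 stand-in for std::void_t.
template <typename...> struct MakeVoid { using type = void; };
template <typename... Ts> using VoidT = typename MakeVoid<Ts...>::type;

// Primary template: assume T has no member serialize(Archive&).
template <typename T, typename Archive, typename = void>
struct HasMemberSerialize : std::false_type {};

// Selected when value.serialize(archive) is a valid expression.
template <typename T, typename Archive>
struct HasMemberSerialize<T, Archive,
    VoidT<decltype(std::declval<T&>().serialize(std::declval<Archive&>()))>>
    : std::true_type {};

// Internal case: call the member function.
template <typename Archive, typename T>
void dispatch(Archive& ar, T& value, std::true_type)
{
    value.serialize(ar);
}

// External case: call a free function found by argument-dependent lookup.
template <typename Archive, typename T>
void dispatch(Archive& ar, T& value, std::false_type)
{
    serialize(ar, value);
}

template <typename Archive, typename T>
void process(Archive& ar, T& value)
{
    dispatch(ar, value, HasMemberSerialize<T, Archive>{});
}

// Hypothetical archive that just records the member names it visits.
struct NameListArchive
{
    std::string names;

    template <typename T>
    void operator()(const char* key, const T&)
    {
        names += key;
        names += ';';
    }
};

// User type with an internal serialize function.
struct Internal
{
    int a = 0;

    template <typename Archive>
    void serialize(Archive& ar) { ar("a", a); }
};

// User type with an external serialize function.
struct External { int b = 0; };

template <typename Archive>
void serialize(Archive& ar, External& e) { ar("b", e.b); }

static_assert(HasMemberSerialize<Internal, NameListArchive>::value,
              "member serialize should be detected");
static_assert(!HasMemberSerialize<External, NameListArchive>::value,
              "no member serialize should be detected");

void example()
{
    NameListArchive ar;
    Internal in;
    External ex;
    process(ar, in); // calls in.serialize(ar)
    process(ar, ex); // calls serialize(ar, ex)
    assert(ar.names == "a;b;");
}
```

Even this reduced version needs a fair amount of template machinery, which is the kind of work a library like Cereal already has done.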

The points I raised in the above post can be mitigated as follows:

  1. The added dependency is entirely optional.
  2. The cost of serialization is mainly the construction of a temporary std::stack object. Since a std::map has to be constructed anyway for a new wamp::Object, the extra overhead is not that significant.
  3. That "front end" conversion of standard library and Boost containers involves quite a bit of work, and it would be a shame not to leverage it.
  4. When used in a simplistic way, there is no metadata involved.
  5. It turns out that I can specialize the serialization of std::map<std::string, T> for my DTO archive. I would also specialize std::array<T, N> so that it's serialized as a normal array.
  6. It looks like I can reject positional values at runtime for my DTO archive.
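Points 5 and 6 can be sketched without Cereal as follows. This is only an illustration of the idea, not Cereal's specialization mechanism: ToyDtoArchive and its render overloads are hypothetical, and the archive emits JSON-like text rather than a wamp::Object. Overloads for std::map<std::string, T> and std::array<T, N> produce a normal object and a normal array, and an unnamed (positional) value is rejected at runtime.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a DTO archive; renders JSON-like text.
class ToyDtoArchive
{
public:
    // Keyword (named) value: accepted.
    template <typename T>
    void operator()(const std::string& key, const T& value)
    {
        if (!body_.empty())
            body_ += ',';
        body_ += quote(key) + ':' + render(value);
    }

    // Positional (unnamed) value: rejected at runtime.
    template <typename T>
    void operator()(const T&)
    {
        throw std::logic_error("positional values are not supported");
    }

    std::string str() const { return '{' + body_ + '}'; }

private:
    static std::string quote(const std::string& s) { return '"' + s + '"'; }
    static std::string render(int v) { return std::to_string(v); }
    static std::string render(const std::string& s) { return quote(s); }

    // A string-keyed map becomes a normal object, not an array of pairs.
    template <typename T>
    static std::string render(const std::map<std::string, T>& m)
    {
        std::string out;
        for (const auto& kv : m)
        {
            if (!out.empty())
                out += ',';
            out += quote(kv.first) + ':' + render(kv.second);
        }
        return '{' + out + '}';
    }

    // A fixed-size array becomes a normal array.
    template <typename T, std::size_t N>
    static std::string render(const std::array<T, N>& a)
    {
        std::string out;
        for (std::size_t i = 0; i < N; ++i)
        {
            if (i)
                out += ',';
            out += render(a[i]);
        }
        return '[' + out + ']';
    }

    std::string body_;
};

void example()
{
    ToyDtoArchive ar;
    ar("id", 42);
    ar("scores", std::map<std::string, int>{{"a", 1}, {"b", 2}});
    ar("pair", std::array<int, 2>{{3, 4}});
    assert(ar.str() == "{\"id\":42,\"scores\":{\"a\":1,\"b\":2},\"pair\":[3,4]}");

    bool rejected = false;
    try { ar(7); } catch (const std::logic_error&) { rejected = true; }
    assert(rejected);
}
```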

Another argument for Cereal is that if the user already uses it (say, for file I/O), then they can reuse their same serialization functions for converting their objects to wamp::Object.
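The reuse argument can be illustrated with a small sketch in which one user-written serialize function feeds two unrelated archives. LineArchive and ObjectArchive are hypothetical; the second one uses a plain std::map as a stand-in for wamp::Object so that the example compiles without CppWAMP or Cereal.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

struct Point
{
    int x;
    int y;
};

// Written once by the user.
template <typename Archive>
void serialize(Archive& ar, Point& p)
{
    ar("x", p.x);
    ar("y", p.y);
}

// Hypothetical file-style archive: one "key value" line per member.
struct LineArchive
{
    std::ostringstream out;

    void operator()(const char* key, int v) { out << key << ' ' << v << '\n'; }
};

// Hypothetical archive that builds a key/value object
// (std::map standing in for wamp::Object).
struct ObjectArchive
{
    std::map<std::string, int> object;

    void operator()(const char* key, int v) { object[key] = v; }
};

void example()
{
    Point p{1, 2};
    LineArchive file;
    ObjectArchive obj;
    serialize(file, p); // e.g. saving to a file
    serialize(obj, p);  // e.g. converting for WAMP
    assert(file.out.str() == "x 1\ny 2\n");
    assert(obj.object.at("x") == 1 && obj.object.at("y") == 2);
}
```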