fraillt/bitsery

Easy serialization for POD types

VelocityRa opened this issue · 12 comments

If a struct is simply a POD type, i.e. in my case a large struct with tons of trivially serializable fields, there shouldn't be a need to write a serialize function that tediously lists all of them (which is also error-prone). I know of brief_syntax, but it doesn't really solve the problem.

Is this currently possible?


RPCS3's serialization system supports this very nicely (impl in this file), if you want an example.

Thanks for reading!

Currently it's not possible, and I'm not planning to add this as core functionality.
I'm not a fan of macros either, but it is possible to achieve this using structured bindings with a little bit of template magic, using SFINAE or more modern techniques like if constexpr or concepts. This requires at least C++17.
Maybe I'll play with this idea in the future and create an extension that is able to serialize/deserialize any POD type.

@fraillt
Can you be a bit more specific?
I don't get it, why would you need structured bindings / template stuff? Why would it not just be a memcpy or whatever using the sizeof the serialized type (like RPCS3)? I mean, the benefit of POD types is that you can just copy the raw bytes.
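
For reference, the raw-copy idea asked about here could be sketched without bitsery roughly as follows. This is a minimal sketch and not RPCS3's or bitsery's actual implementation; `append_raw`, `read_raw`, `round_trip` and `RawSample` are hypothetical names introduced only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <vector>

// Appends the object representation of `value` to `out`.
template <typename T>
void append_raw(std::vector<std::uint8_t>& out, const T& value) {
  static_assert(std::is_trivially_copyable<T>::value,
                "raw copies are only valid for trivially copyable types");
  const std::size_t offset = out.size();
  out.resize(offset + sizeof(T));
  std::memcpy(out.data() + offset, &value, sizeof(T));
}

// Reads sizeof(T) bytes starting at `offset`; returns false if too few remain.
template <typename T>
bool read_raw(const std::vector<std::uint8_t>& in, std::size_t& offset, T& value) {
  static_assert(std::is_trivially_copyable<T>::value,
                "raw copies are only valid for trivially copyable types");
  if (offset > in.size() || in.size() - offset < sizeof(T)) {
    return false;
  }
  std::memcpy(&value, in.data() + offset, sizeof(T));
  offset += sizeof(T);
  return true;
}

// Hypothetical sample type, used only for this sketch.
struct RawSample {
  std::uint16_t a;
  std::int32_t b;
};

// Usage: write a value into a fresh buffer and read it back.
template <typename T>
T round_trip(const T& value) {
  std::vector<std::uint8_t> buffer;
  append_raw(buffer, value);
  T result{};
  std::size_t offset = 0;
  const bool ok = read_raw(buffer, offset, result);
  assert(ok && offset == buffer.size());
  (void)ok;
  return result;
}
```

The bytes written this way include any padding and use the host's byte order, so the result is only meaningful when writer and reader share the same ABI.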

Initially I thought that you wanted to deconstruct your object and serialize each field separately.
But if I understand correctly, you don't care about memory layout and endianness here; you just want to memcpy the entire struct.
In this case it's quite trivial. I played a bit and came up with this code:

#include <bitsery/bitsery.h>
#include <bitsery/adapter/buffer.h>
#include <bitsery/traits/vector.h>
#include <cassert>
#include <cstdint>
#include <memory>
#include <type_traits>
#include <vector>

// for implementing extension
#include <bitsery/traits/core/traits.h>

namespace bitsery {

namespace ext {

class TrivialCopy {
public:

  template<typename Ser, typename T, typename Fnc>
  void serialize(Ser &ser, const T &v, Fnc &&fnc) const {
    const void* addr = std::addressof(v);
    ser.adapter().template writeBuffer<1, uint8_t>(static_cast<const uint8_t*>(addr), sizeof(T));
  }

  template<typename Des, typename T, typename Fnc>
  void deserialize(Des &des, T &v, Fnc &&fnc) const {
    void* addr = std::addressof(v);
    des.adapter().template readBuffer<1, uint8_t>(static_cast<uint8_t*>(addr), sizeof(T));
  }

};
}

namespace traits {
template<typename T>
struct ExtensionTraits<ext::TrivialCopy, T> {
  using TValue = T;

  static_assert(std::is_trivially_copyable<T>::value, "Your type must be trivially_copyable");

  static constexpr bool SupportValueOverload = false;
  static constexpr bool SupportObjectOverload = false;
  // use lambda overload with empty lambda, because object overload expects
  // to have `serialize` method for `T`
  static constexpr bool SupportLambdaOverload = true;
};
}

}

struct MyData {
  uint16_t a{};
  uint16_t b{};
  int32_t c{};
  int64_t d{};
};

using Buffer = std::vector<uint8_t>;
using OutputAdapter = bitsery::OutputBufferAdapter<Buffer>;
using InputAdapter = bitsery::InputBufferAdapter<Buffer>;

int
main()
{
  auto data = MyData {
    32,
    8795,
    -786435,
    5849614964464
  };
  MyData res {};
  Buffer buffer {};

  // serialize
  bitsery::Serializer<OutputAdapter> ser{ buffer };
  // we need to pass any lambda here, to use lambda overload for `.ext` method
  // it looks a bit hackish, but solves the problem :)
  ser.ext(data, bitsery::ext::TrivialCopy{}, []() {});
  ser.adapter().flush();

  // deserialize
  bitsery::Deserializer<InputAdapter> des{ buffer.begin(),
                                     ser.adapter().writtenBytesCount() };
  des.ext(res, bitsery::ext::TrivialCopy{}, []() {});

  // verify
  assert(data.a == res.a && data.b == res.b && data.c == res.c && data.d == res.d);
  return 0;
}
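
The trade-off mentioned before this example (ignoring memory layout and endianness) can be made concrete with a small sketch. It contrasts a raw copy, which writes `sizeof(T)` bytes in host order, with writing one field in a fixed byte order. `le_byte`, `write_le32`, `Tagged` and `kFieldBytes` are hypothetical names for this illustration only, not part of bitsery.

```cpp
#include <cstddef>
#include <cstdint>

// Byte `index` (0 = least significant) of a 32-bit value.
constexpr std::uint8_t le_byte(std::uint32_t value, unsigned index) {
  return static_cast<std::uint8_t>((value >> (8u * index)) & 0xFFu);
}

// Writes `value` as 4 bytes, least significant byte first, whatever the
// host's native byte order is.
inline void write_le32(std::uint8_t* out, std::uint32_t value) {
  for (unsigned i = 0; i < 4; ++i) {
    out[i] = le_byte(value, i);
  }
}

// Hypothetical struct whose in-memory size may exceed the sum of its fields,
// because the compiler is free to insert padding after `tag`.
struct Tagged {
  std::uint8_t tag;
  std::uint32_t value;
};

// Bytes needed when the two fields are written one by one.
constexpr std::size_t kFieldBytes = sizeof(std::uint8_t) + sizeof(std::uint32_t);

// A raw copy never writes fewer bytes than the fields need; on many ABIs it
// writes more, and the extra padding bytes have unspecified values.
static_assert(sizeof(Tagged) >= kFieldBytes,
              "object representation covers at least all fields");
```

A raw copy is fine when both ends share the same compiler, architecture and struct definition; a field-wise encoding is the safer choice once the data crosses such a boundary.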

This is what I initially thought of.
To run it you need C++20 (it's possible to do it with C++17, but that would require more boilerplate).
Modify CMakeLists.txt:

set(CMAKE_CXX_STANDARD 20)
set(CMAKE_CXX_STANDARD_REQUIRED ON)

And here's the actual code:

#include <bitsery/bitsery.h>
#include <bitsery/adapter/buffer.h>
#include <bitsery/traits/vector.h>
#include <bitsery/brief_syntax.h>
#include <cassert>
#include <cstdint>
#include <type_traits>
#include <vector>

// this will be converted into any type, that we need
struct any
{
  template <typename T>
  operator T();
};

template <typename Ser, typename T>
void serialize_aggregate(Ser &ser, T& v) {
  static_assert(std::is_aggregate_v<T>, "only aggregate types can be supported");
  if constexpr ( requires { T { any{}, any{}, any{}, any{} }; } ) {
    auto && [a1, a2, a3, a4] = v;
    ser(a1, a2, a3, a4);
  } else if constexpr ( requires { T { any{}, any{}, any{} }; } ) {
    auto && [a1, a2, a3] = v;
    ser(a1, a2, a3);
  } else if constexpr ( requires { T { any{}, any{} }; } ) {
    auto && [a1, a2] = v;
    ser(a1, a2);
  } else if constexpr ( requires { T { any {}}; }) {
    auto && [a1] = v;
    ser(a1);
  } else {
    // T is known to be an aggregate (asserted at the top), so this assertion
    // always fires whenever this branch is instantiated
    static_assert(!std::is_aggregate_v<T>, "only supports struct up to 4 fields");
  }
}

enum MyEnum {
  A,
  B,
  C
};

struct MyData {
  uint16_t a{};
  uint16_t b{};
  int32_t c{};
  MyEnum d{};
};

using Buffer = std::vector<uint8_t>;
using OutputAdapter = bitsery::OutputBufferAdapter<Buffer>;
using InputAdapter = bitsery::InputBufferAdapter<Buffer>;

int
main()
{
  auto data = MyData {
    32,
    8795,
    -786435,
    MyEnum::B,
  };
  MyData res {};
  Buffer buffer {};

  // serialize
  bitsery::Serializer<OutputAdapter> ser{ buffer };
  serialize_aggregate(ser, data);
  ser.adapter().flush();

  // deserialize
  bitsery::Deserializer<InputAdapter> des{ buffer.begin(),
                                     ser.adapter().writtenBytesCount() };
  serialize_aggregate(des, res);

  // verify
  assert(data.a == res.a && data.b == res.b && data.c == res.c && data.d == res.d);
  return 0;
}
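
The C++17 variant mentioned above (possible, but with more boilerplate) might look roughly like the sketch below. It replaces the `requires` expressions with the `std::void_t` detection idiom and keeps the `if constexpr` plus structured-binding dispatch. To stay independent of bitsery, a generic callable `fn` stands in for the serializer call; `accepts_impl`, `accepts`, `visit_fields`, `Sample` and `example` are hypothetical names, and the sketch is capped at 3 fields to keep it short.

```cpp
#include <cassert>
#include <type_traits>

// Converts to whatever type a field needs; declared only, because it is
// used exclusively in unevaluated contexts.
struct any {
  template <typename T>
  operator T();
};

// Detection idiom: the specialization is chosen only if T{Args{}...} is valid.
template <typename T, typename Void, typename... Args>
struct accepts_impl : std::false_type {};

template <typename T, typename... Args>
struct accepts_impl<T, std::void_t<decltype(T{Args{}...})>, Args...>
    : std::true_type {};

template <typename T, typename... Args>
constexpr bool accepts = accepts_impl<T, void, Args...>::value;

// Calls fn(field) for every field, trying the largest field count first.
template <typename T, typename Fn>
void visit_fields(T& v, Fn&& fn) {
  static_assert(std::is_aggregate_v<T>, "only aggregate types are supported");
  if constexpr (accepts<T, any, any, any>) {
    auto& [a1, a2, a3] = v;
    fn(a1); fn(a2); fn(a3);
  } else if constexpr (accepts<T, any, any>) {
    auto& [a1, a2] = v;
    fn(a1); fn(a2);
  } else if constexpr (accepts<T, any>) {
    auto& [a1] = v;
    fn(a1);
  } else {
    static_assert(!std::is_aggregate_v<T>, "only supports structs with 1 to 3 fields");
  }
}

// Hypothetical aggregate used for the usage example.
struct Sample {
  int a;
  short b;
  long c;
};

// Usage: add up the fields of a three-member aggregate.
inline void example() {
  Sample s{1, 2, 3};
  long total = 0;
  visit_fields(s, [&total](auto& field) { total += field; });
  assert(total == 6);
}
```

Like the C++20 version, this counts initializers rather than members, so it shares the same limitation for aggregates that contain C arrays.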

... and a bit more complete solution :)

#include <bitsery/bitsery.h>
#include <bitsery/adapter/buffer.h>
#include <bitsery/traits/vector.h>
#include <bitsery/brief_syntax.h>
#include <cassert>
#include <cstdint>
#include <type_traits>
#include <utility>
#include <vector>

// this will be used to construct any field of an aggregate in a `requires` clause
struct any
{
  template <typename T>
  operator T();
};

// forward declarations
template <typename Ser, typename T>
void serialize_aggregate(Ser &ser, T& v);

// to stop variadic expansion
template <typename Ser>
void serialize_impl(Ser& ser) {
}

template <typename Ser, typename T, typename ...Ts>
void serialize_impl(Ser& ser, T& v, Ts&& ...rest) {
  if constexpr ( std::is_aggregate_v<T>) {
    serialize_aggregate(ser, v);
  } else {
    ser(v);
  }
  serialize_impl(ser, std::forward<Ts>(rest)...);
}

template <typename Ser, typename T>
void serialize_aggregate(Ser &ser, T& v) {
  static_assert(std::is_aggregate_v<T>, "only aggregate types can be supported");
  if constexpr ( requires { T { any{}, any{}, any{}, any{} }; } ) {
    auto && [a1, a2, a3, a4] = v;
    serialize_impl(ser, a1, a2, a3, a4);
  } else if constexpr ( requires { T { any{}, any{}, any{} }; } ) {
    auto && [a1, a2, a3] = v;
    serialize_impl(ser, a1, a2, a3);
  } else if constexpr ( requires { T { any{}, any{} }; } ) {
    auto && [a1, a2] = v;
    serialize_impl(ser, a1, a2);
  } else if constexpr ( requires { T { any {}}; }) {
    auto && [a1] = v;
    serialize_impl(ser, a1);
  } else {
    // T is known to be an aggregate (asserted at the top), so this assertion
    // always fires whenever this branch is instantiated
    static_assert(!std::is_aggregate_v<T>, "only supports struct up to 4 fields");
  }
}

struct Xxx {
  int x;
  int32_t z;
};

struct MyData {
  uint16_t a{};
  uint16_t b{};
  int32_t c[2];
  Xxx d{};
};

using Buffer = std::vector<uint8_t>;
using OutputAdapter = bitsery::OutputBufferAdapter<Buffer>;
using InputAdapter = bitsery::InputBufferAdapter<Buffer>;

int
main()
{
  auto data = MyData {
    32,
    8795,
    {-786435, 23423},
    Xxx { 3, 454 },
  };
  MyData res {};
  Buffer buffer {};

  // serialize
  bitsery::Serializer<OutputAdapter> ser{ buffer };
  serialize_aggregate(ser, data);
  ser.adapter().flush();

  // deserialize
  bitsery::Deserializer<InputAdapter> des{ buffer.begin(),
                                     ser.adapter().writtenBytesCount() };
  serialize_aggregate(des, res);

  // verify
  assert(data.a == res.a && data.b == res.b && data.c[1] == res.c[1] && data.d.x == res.d.x);
  return 0;
}

I hope that would help.

I didn't check, but I'm pretty sure it will not work if you remove one of the non-array members in MyData, due to brace elision support in aggregate initialization. Right now it works because 4 happens to be the max number of members, so the structured binding works.

Hi, nice to see you here :)
I just tested, and it seems to work if I remove any of the fields.
I think brace elision has nothing to do here, the real magic is the conversion operator.

  template <typename T>
  operator T();

As I understand T { any{}, ...} means that we first initialize/create any{}, and then it gets converted to any type that is needed for a specific field of T.
However, I agree that this is not a complete solution and is rather limited, but I hope that it still might be useful for @VelocityRa :)
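
The role of the conversion operator described above can be checked in isolation. The sketch below only uses standard type traits; `Color` and `Point` are hypothetical types made up for the illustration.

```cpp
#include <cstdint>
#include <type_traits>

// Same helper as in the thread: convertible to any requested type.
// The operator is declared but never defined, which is fine because it is
// only used in unevaluated contexts (decltype, type traits, requires).
struct any {
  template <typename T>
  operator T();
};

enum Color { Red, Green };

struct Point {
  std::uint16_t x;
  double y;
  Color c;
};

// For each field, T is deduced as exactly the field's type.
static_assert(std::is_convertible<any, std::uint16_t>::value, "any -> uint16_t");
static_assert(std::is_convertible<any, double>::value, "any -> double");
static_assert(std::is_convertible<any, Color>::value, "any -> enum");

// Hence one `any{}` per field forms a valid aggregate initialization.
static_assert(std::is_same<decltype(Point{any{}, any{}, any{}}), Point>::value,
              "Point accepts three any{} initializers");
```

Because each `any{}` adapts to whatever the field needs, the number of accepted initializers is the only information the check extracts from the type.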

Whoa, thanks so much for taking the time to write this!
I'll see if I can use it. MIT licensed right?

Have fun ;)
I was having fun too, and I should mention that @eyalz800 was the one who showed me that this is possible with modern C++, so thanks to him as well :)

@fraillt how does it work for you with the following:

struct MyData {
  uint16_t a{};
  int32_t c[2];
  Xxx d{};
};

In the case above I expect T{any{},any{},any{},any{}} to work and structured binding to fail with 4 members (because there are only 3). Because of the brace elision the middle two “any”s go to the int array.
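
The mismatch described here can be reproduced at compile time. The following is a sketch using the detection idiom so that it does not depend on C++20; `any_at`, `accepts_n`, `accepts`, `Flat` and `WithArray` are hypothetical names, and `any_at<I>` plays the same role as `any` in the thread.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>
#include <utility>

// Indexed variant of `any`, so a pack of N of them can be expanded.
template <std::size_t>
struct any_at {
  template <typename T>
  operator T();  // declared only; used in unevaluated contexts
};

// True if T can be brace-initialized from N "convert to anything" values.
template <typename T, typename Seq, typename = void>
struct accepts_n : std::false_type {};

template <typename T, std::size_t... I>
struct accepts_n<T, std::index_sequence<I...>,
                 decltype(void(T{any_at<I>{}...}))> : std::true_type {};

template <typename T, std::size_t N>
constexpr bool accepts() {
  return accepts_n<T, std::make_index_sequence<N>>::value;
}

struct Xxx {
  int x;
  std::int32_t z;
};

// Three members, no array: exactly three initializers are accepted.
struct Flat {
  std::uint16_t a;
  std::int32_t c;
  Xxx d;
};

// Three members, one of them an array of two elements.
struct WithArray {
  std::uint16_t a;
  std::int32_t c[2];
  Xxx d;
};

static_assert(accepts<Flat, 3>() && !accepts<Flat, 4>(),
              "without arrays, the initializer count matches the member count");

// Nothing converts to int32_t[2] as a whole, so brace elision hands two
// initializers to `c`: four are accepted although there are three members.
static_assert(accepts<WithArray, 4>() && !accepts<WithArray, 5>(),
              "array elements are counted individually");
```

A structured binding, in contrast, needs exactly one name per member (three for `WithArray`), which is why the 4-initializer branch is selected and then fails to compile.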

Yep, you're right.
I tried various ways and came up with this idea:

if constexpr ( requires { [](T& v){ auto & [a1, a2, a3, a4] = v;}; } ) {
...
}

it works on Clang, but not on GCC, although I guess it should work...
Why C++ needs to be so complex... :)

What you did with the lambda is exactly what is described in my library's README (github.com/eyalz800/zpp_bits). It's an area where I think the standard is not fully clear, but the intent leans towards making this a hard error: the error does not occur in the immediate context, so it doesn't count for SFINAE. GCC is therefore probably correct.