ar90n/msgpack11

Sending complex objects over the network and deserializing them

JoaoAJMatos opened this issue · 6 comments

I'm currently using this implementation of MessagePack for serializing/deserializing somewhat complex structures in a project I've been developing.

I want to be able to send and receive serialized MessagePack Objects that can then be parsed and constructed in order to build the "complex structure" on the receiving end.

The complex structure in question is a Blockchain. My Blockchain class consists of an array of Blocks (Block class instances) which themselves contain an array of Transactions (Transaction class instances).

I was able to serialize the Blockchain class by implementing serialize methods in each of the classes: Blockchain::serialize(), Block::serialize(), Transaction::serialize().

After serializing the Blockchain, I would send the dump() output of that object to the receiving end, in order for it to construct its own Blockchain from the incoming packet.

The problem is that, after serializing the Blockchain, I can access its elements on the Sender Node (the one where the serialized blockchain originated); however, I cannot do this on the Receiver Node. I get the MessagePack dump string, but I am not able to parse the incoming buffer. I've checked the parse error, and it's populated with the message "end of buffer".

I'm not sure if this problem has to do with the way I try to serialize and construct the classes, or if it has to do with how I send the MessagePack dump() output to the receiver Node.

This issue might not have been very explicit but I am not sure how I can explain it better. I'm just looking for help to figure out what's causing this. I can get into further detail if someone's interested in helping out.

Also, keep in mind that I am a young dev, and I'm quite new to C++. The code may look like crap, be advised.

ar90n commented

Hi @JoaoAJMatos
Thanks for using msgpack11 in your project and for the detailed report. This may be a bug in msgpack11. Could you share a msgpack file that causes this error and a minimal example that reproduces it?

Hello @ar90n! Thanks for the fast reply.

I'm afraid it would be hard to give you an example of how to reproduce this specific error. As far as I know, it might not even be a bug in msgpack11; maybe just bad usage. However, I will try to give you an in-depth explanation of how I get this error using a simple example.

Suppose we have a Company class, which contains information on all the employees that work there. The Company class has an array of People (Person class instances). Each of the classes has a serialize method that serializes the class instance into a MessagePack Object.

Here is the code:

Person class

#include <iostream>
#include <string>
#include <vector>
#include "msgpack11.hpp"

class Person
{
private:
    /* MEMBER VARIABLES */
    std::string name;
    int age;

public:
    /* CONSTRUCTORS */
    Person(std::string name, int age)
    {
        this->name = name;
        this->age = age;
    }

    Person(std::string serializedBuffer) // Constructs a person from a serialized buffer
    {
        std::string err;
        msgpack11::MsgPack msgpk = msgpack11::MsgPack::parse(serializedBuffer, err);
        
        this->name = msgpk["name"].string_value();
        this->age = msgpk["age"].int_value();
    }

    /* PUBLIC FUNCTIONS */

    msgpack11::MsgPack serialize()
    {
        using namespace msgpack11;

        MsgPack data = MsgPack::object {
            {"name", this->name},
            {"age", this->age}
        };

        return data;
    }

    /* GETTERS */

    std::string getName() {
        return this->name;
    }

    int getAge() {
        return this->age;
    }
};

Company class

class Company
{
private:
    /* MEMBER VARIABLES */
    std::vector<Person> employees;

public:
    /* CONSTRUCTORS */
    Company(std::vector<Person> people)
    {
        this->employees = people;
    }

    Company(std::string serializedBuffer) // Constructs a company from a serialized buffer
    {
        std::string err;

        msgpack11::MsgPack msgpk = msgpack11::MsgPack::parse(serializedBuffer, err);
        for (auto& element : msgpk["employees"].array_items())
        {
            this->employees.push_back(Person(element.dump())); // Creates a new Person object from the serialized buffer
        }
    }

    /* PUBLIC FUNCTIONS */

    msgpack11::MsgPack serialize()
    {
        using namespace msgpack11;

        MsgPack::array temp;

        for (auto& person : employees)
        {
            temp.push_back(person.serialize());
        }

        MsgPack pack = MsgPack::object {
            {"employees", temp}
        };

        return pack;
    }

    void printEmployees()
    {
        for (auto& person : employees)
        {
            std::cout << person.getName() << " " << person.getAge() << std::endl;
        }
    }
};

Main function example

int main()
{
    using namespace msgpack11;
    std::string err;

    Person person1("John", 30);
    Person person2("Rita", 25);
    Person person3("Pedro", 34);

    Company myCompany({person1, person2, person3});

    std::string serializedCompany = myCompany.serialize().dump();
    std::cout << "Serialized company: " << serializedCompany << std::endl;

    Company myNewCompany(serializedCompany);
    myNewCompany.printEmployees();

    return 0;
}

This example illustrates what I am trying to do in a much simpler way.

Basically, what I need to do is send the serializedCompany buffer to another computer in the network. That computer, after receiving the buffer, must construct its own company instance from the incoming buffer.

The problem is that, when I do this in my project, the receiver does not receive the full buffer, and therefore cannot construct a new Blockchain class. And that is why the err variable is populated with the message end of buffer.

The real problem though, is that I don't know how to solve this in my specific use case, and I was trying to look for someone who could help me.

It would help if you took a look at the code of the project.

We can trace the error back to the syncChains() method here, where I try to create a new Blockchain instance from the incoming buffer.

Edit:

This might just as well be an issue with the way I send data over the sockets. As I said, I'm new to C++, so I am finding out stuff along the way.

ar90n commented

Hi @JoaoAJMatos
Thanks for the sample code and the information. I think this problem is caused by mishandling the received data.
My guess is the following.

As you know, MessagePack is a binary serialization format. 'Binary' here means that the serialized data can contain null bytes ('\0'), for example when an integer field has the value 0. So we cannot handle the data as a C string, which ends at the first null character.

But this line constructs a std::string instance from a C string. So if response_buffer contains a null character in the middle, the std::string constructor stops reading there and produces an incomplete value. I think this problem will be solved by passing the length of the content to the std::string constructor.
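To see the truncation without msgpack11 or a socket, here is a minimal sketch. The byte array is written by hand from the MessagePack format rules (it is not output captured from the library), and the names kPacked, from_c_string and from_pointer_and_length are made up for this illustration.

```cpp
#include <cstddef>
#include <string>

// Hand-written MessagePack bytes for the map {"age": 0}:
//   0x81 = fixmap with 1 entry
//   0xa3 = fixstr of length 3, followed by 'a' 'g' 'e'
//   0x00 = positive fixint 0
const char kPacked[] = {'\x81', '\xa3', 'a', 'g', 'e', '\x00'};

// Treats the bytes as a C string: copying stops at the first '\0',
// so the encoded value 0 is lost.
std::string from_c_string()
{
    return std::string(kPacked);
}

// Passes the length explicitly: all six bytes are copied, '\0' included.
std::string from_pointer_and_length()
{
    return std::string(kPacked, sizeof kPacked);
}
```

The first function yields a 5-byte string and the second a 6-byte string. A parser that is handed the shorter one runs out of input while reading the map, which matches the "end of buffer" error described earlier in the thread.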

The following code is a simulation to reproduce the above situation. To mimic network communication, this code converts serializedCompany into a C string and then creates a std::string instance from it.

int main()
{
    using namespace msgpack11;
    std::string err;

    Person person1("John", 30);
    Person person2("Rita", 0);
    Person person3("Pedro", 34);

    Company myCompany({person1, person2, person3});

    std::string serializedCompany = myCompany.serialize().dump();
    std::cout << "Serialized company: " << serializedCompany << std::endl;

    const char* recvData = serializedCompany.c_str();
    const int  recvDataLength = serializedCompany.length();

    std::string recvSerializedCompany(recvData);  // this line doesn't work: copying stops at the first '\0'
    //std::string recvSerializedCompany(recvData, recvDataLength); // this line works: the length is passed explicitly
    std::cout << "Received Serialized company: " << recvSerializedCompany << std::endl;

    Company myNewCompany(recvSerializedCompany);
    myNewCompany.printEmployees();

    return 0;
}

The result of the above code is the following. We can see that deserialization fails.

$ ./a.out
Serialized company: employeesagenameJohnagenameRitaage"namePedro
Received Serialized company: employeesagenameJohnage

After switching the construction of recvSerializedCompany, we get the following result. It works correctly.

$ ./a.out
Serialized company: employeesagenameJohnagenameRitaage"namePedro
Received Serialized company: employeesagenameJohnagenameRitaage"namePedro
John 30
Rita 0
Pedro 34
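Applied to a real socket, the same fix could look roughly like the sketch below. It assumes POSIX sockets and a connected descriptor fd. The helper name receive_chunk and the 4096-byte buffer size are made up for this illustration and are not taken from the project.

```cpp
#include <cstddef>
#include <string>
#include <sys/socket.h>
#include <sys/types.h>

// Hypothetical helper: reads one chunk from a connected socket and returns
// exactly the bytes recv() reported, including any '\0' bytes.
// Returns an empty string on error or when the peer has closed the connection.
std::string receive_chunk(int fd)
{
    char response_buffer[4096];
    const ssize_t received = ::recv(fd, response_buffer, sizeof response_buffer, 0);
    if (received <= 0)
        return std::string();

    // Pass the byte count explicitly; std::string(response_buffer) would
    // stop at the first '\0'.
    return std::string(response_buffer, static_cast<std::size_t>(received));
}
```

Note that a single recv call may return only part of a large message, so a real receiver would probably need to keep reading until it has the whole payload.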

I hope this is useful to you.

Thank you once more for the reply.
You are completely right, I totally missed that!
I will try your fix and edit this reply if it works.

I cannot thank you enough for your time and effort :)

(Btw most of the code you see there is for debugging purposes and to try to find a work-around. Everything will be clean in the final project)

I have been testing out some stuff, and I realized something.
How can I send the serialized data through the socket, if I can't get the full buffer with c_str()?
Everything works as expected on the receiving end, the only problem is that I cannot send the full buffer to the client.

For example, how would I send recvSerializedCompany to my client in order for him to deserialize the buffer?

Do you have any idea of how I could solve this?

ar90n commented

The only task of c_str() is to return a pointer to the head of the buffer, so it is hard to see why we could not get the full buffer with c_str(). The bytes after a null character are still there, and size() gives the full length.
I looked into the code that sends data in NodeServer.cpp and NodeClient.cpp. My guess is that the send function cannot send all of the given data at once because it is too large for the internal buffer.

Please check the return value of send. It is the number of bytes that were actually sent. If this value is less than the size you passed to send, the rest of the data has not been sent yet.
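A loop along the following lines is one common way to act on that return value. This is only a sketch, assuming POSIX sockets and a connected, blocking descriptor. The helper name send_all is made up here and is not part of msgpack11 or the project.

```cpp
#include <cerrno>
#include <cstddef>
#include <string>
#include <sys/socket.h>
#include <sys/types.h>

// Hypothetical helper: keeps calling send() until every byte of `data`
// has been handed to the socket. Returns true on success, false on error.
bool send_all(int fd, const std::string& data)
{
    // data() together with size() covers the whole buffer, even when it
    // contains '\0' bytes in the middle.
    const char* cursor = data.data();
    std::size_t remaining = data.size();

    while (remaining > 0)
    {
        const ssize_t sent = ::send(fd, cursor, remaining, 0);
        if (sent < 0)
        {
            if (errno == EINTR)
                continue;      // interrupted by a signal, try again
            return false;      // real error
        }
        if (sent == 0)
            return false;      // no progress, give up instead of spinning

        // Advance past the bytes that were accepted and send the rest.
        cursor += sent;
        remaining -= static_cast<std::size_t>(sent);
    }
    return true;
}
```

With such a helper, the serialized std::string could be passed directly, for example send_all(fd, serializedCompany), without going through a C string at all.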