Union type field is not read properly between flatc -> flatcc

Question

Union type field is not read properly between flatc -> flatcc

bzyx opened this issue 2 years ago · 9 comments

I have prepared the following flatbuffers schema to show the case

namespace TestMSG;

table MsgPing {
	text: string;
	length: ushort;
}

table MsgPong {
	text: string;
	length: ushort;
}

union MessageUnion { MsgPing, MsgPong}

table ComboMessage {
	message: MessageUnion;
}

root_type ComboMessage;

The ComboMessage message is a container for any type of message defined by the union.

For test purposes I've used flatc to generate Python wrapper, created a message and wrote it to a file:

import flatbuffers
from TestMSG import ComboMessage
from TestMSG import MessageUnion
from TestMSG import MsgPing
from TestMSG import MsgPong

TEXT = "Hello Python!"

# Generate the message
builder = flatbuffers.Builder(1024)
textToUse = builder.CreateString(TEXT)

MsgPing.Start(builder)
MsgPing.AddText(builder, textToUse)
MsgPing.MsgPingAddLength(builder, len(TEXT))
ping = MsgPing.End(builder)

ComboMessage.Start(builder)
ComboMessage.AddMessage(builder, ping)
ComboMessage.AddMessageType(builder, MessageUnion.MessageUnion().MsgPing)
combo_message = ComboMessage.End(builder)

builder.Finish(combo_message)
buf = builder.Output()

with open("flatbuffer_msg.bin", "wb") as f:
    f.write(buf) 

print(f"{buf!r}")

Then using flatcc I've prepared reader sample:

#include <stdio.h>
#include "testmsg_test_reader.h"

int main(void) {
    char buffer[1024];
	
    FILE *in_file = fopen("flatbuffer_msg.bin", "rb");

	size_t size = fread(buffer, sizeof(char), 1024, in_file);
    printf("Read %ld\n", size);
    
	TestMSG_ComboMessage_table_t msg = TestMSG_ComboMessage_as_root(buffer);
	TestMSG_MessageUnion_union_type_t type = TestMSG_ComboMessage_message_type_get(msg);
	printf("Message type is: %s [%d]\n", TestMSG_MessageUnion_type_name(type), type);

    fclose(in_file);
	return 0;
}

I'd expect to get the Message type to be shown as "MsgPing", instead the output of the application is:

Read 60
Message type is: NONE [0]

Verification snipped using Python shows the correct type:

with open("flatbuffer_msg.bin", "rb") as f_read:
    read_buf = f_read.read()
    read_message = ComboMessage.ComboMessage.GetRootAs(read_buf, 0)
    print("Read message type: ", read_message.MessageType())

Read message type: 1

Answer 1 · 2022-05-20T12:14:42.000Z

Could you please try dump the hex content of the binary file using the new feature

https://github.com/google/flatbuffers/tree/master/tests/annotated_binary

It requires a recent flatc build, I don't think it has been released yet.

Answer 2 · 2022-05-20T12:25:06.000Z

Sure thing I've build both flatc and flatcc from the latest sources:


// Annotated Flatbuffer Binary
//
// Schema file: testmsg_test.fbs
// Binary file: flatbuffer_msg.bin

header:
  +0x00 | 0C 00 00 00             | UOffset32  | 0x0000000C (12) Loc: +0x0C | offset to root table `TestMSG.ComboMessage`

vtable (TestMSG.ComboMessage):
  +0x04 | 08 00                   | uint16_t   | 0x0008 (8)                 | size of this vtable
  +0x06 | 0C 00                   | uint16_t   | 0x000C (12)                | size of referring table
  +0x08 | 07 00                   | VOffset16  | 0x0007 (7)                 | offset to field `message_type` (id: 0)
  +0x0A | 08 00                   | VOffset16  | 0x0008 (8)                 | offset to field `message` (id: 1)

root_table (TestMSG.ComboMessage):
  +0x0C | 08 00 00 00             | SOffset32  | 0x00000008 (8) Loc: +0x04  | offset to vtable
  +0x10 | 00 00 00                | uint8_t[3] | ...                        | padding
  +0x13 | 01                      | UType8     | 0x01 (1)                   | table field `message_type` (UType)
  +0x14 | 0C 00 00 00             | UOffset32  | 0x0000000C (12) Loc: +0x20 | offset to field `message` (union of type `MsgPing`)

vtable (TestMSG.MsgPing):
  +0x18 | 08 00                   | uint16_t   | 0x0008 (8)                 | size of this vtable
  +0x1A | 0C 00                   | uint16_t   | 0x000C (12)                | size of referring table
  +0x1C | 08 00                   | VOffset16  | 0x0008 (8)                 | offset to field `text` (id: 0)
  +0x1E | 06 00                   | VOffset16  | 0x0006 (6)                 | offset to field `length` (id: 1)

table (TestMSG.MsgPing):
  +0x20 | 08 00 00 00             | SOffset32  | 0x00000008 (8) Loc: +0x18  | offset to vtable
  +0x24 | 00 00                   | uint8_t[2] | ..                         | padding
  +0x26 | 0D 00                   | uint16_t   | 0x000D (13)                | table field `length` (UShort)
  +0x28 | 04 00 00 00             | UOffset32  | 0x00000004 (4) Loc: +0x2C  | offset to field `text` (string)

string (TestMSG.MsgPing.text):
  +0x2C | 0D 00 00 00             | uint32_t   | 0x0000000D (13)            | length of string
  +0x30 | 48 65 6C 6C 6F 20 50 79 | char[13]   | Hello Py                   | string literal
  +0x38 | 74 68 6F 6E 21          |            | thon!
  +0x3D | 00                      | char       | 0x00 (0)                   | string terminator

padding:
  +0x3E | 00 00                   | uint8_t[2] | ..                         | padding

Here is the binary file itself:

hexdump --canonical flatbuffer_msg.bin 
00000000  0c 00 00 00 08 00 0c 00  07 00 08 00 08 00 00 00  |................|
00000010  00 00 00 01 0c 00 00 00  08 00 0c 00 08 00 06 00  |................|
00000020  08 00 00 00 00 00 0d 00  04 00 00 00 0d 00 00 00  |................|
00000030  48 65 6c 6c 6f 20 50 79  74 68 6f 6e 21 00 00 00  |Hello Python!...|
00000040

Answer 3 · 2022-05-20T12:36:25.000Z

From a quick inspection, that buffer looks OK, and the type should be 1 and not 0, so it appears that something is wrong with flatcc or the way that it is being used.

Your char buffer is not aligned correctly - a common mistake - but that is unlikely to be the problem if you are using an x86 based system - it will however affect the flatcc verifier if you choose to call it.

I am involved in projects where flatcc is used extensively with unions that interoperate with Go and Javascript, so in general unions should be working.

I will need to have a closer look. I suspect it might specifically be the direct accessor to the union type that is somehow wrong: TestMSG_ComboMessage_message_type_get

You could try to get the union as a struct with two fields: a type and an offset and see what that returns - as per documentation.

And just to rule it out: sometimes people test something different from what they think that they are testing, so please double check it is the correct buffer that you are reading.

Answer 4 · 2022-05-20T12:42:51.000Z

Note that there are 0x40 bytes = 64 dec. bytes in the hexdump.
In C you are only reading 60 bytes. Truncation is not necessarily a cause of the problem at hand, but it might suggest that you are reading another buffer.

Answer 5 · 2022-05-20T13:05:04.000Z

When I compile your program with the buffer you provided, I get the expected result:

Read 64
Message type is: MsgPing [1]

Answer 6 · 2022-05-20T13:21:59.000Z

You are right, in both cases. First I messed up with the files and after double check, I did observe that the result is okay, but the problem I have is XY-kind problem and I tried to reproduce it first on the X86 platform. But it didn't work for me when I used Zephyr based ARM embeeeded system.

EDIT:
After making sure that I have the same fbs file and exactly the same code on each platform it started working for me.

You did mention alignment issues, what should I align and how? I want to make sure that the code will always work.

Answer 7 · 2022-05-20T13:33:55.000Z

I wrote the below before your edit, but it still applies:

I am not familiar with ARM Zephyr. Recent ARM appear to not mind unaligned access much. Some relatively new ones take a performance hit, but the latest not so much - I have no direct experience with this. Other older or embedded / low-end platforms might object more aggressively, but then you would expect a segfault.

Another issue is that older ARMs might be big endian. While this should not be a problem for flatcc, and the binary buffer is the same in all cases, it might trigger some code paths that are not regularly testet.

However, flatcc v0.4 has been extensively test with big endian.

As to your new question on alignment:

Alignment is based on the largest datatype stored in the buffer. This is minimum 4 bytes because the offset to the root table is 4 bytes. If you have an int64 type or a double float, it will be 8 bytes. If you explictly use a FlatBuffer attribute to align structs to more than that, you will need to align to that.

Since not all buffers of the same schema necessarily use the largest data type, flatcc builder can give you information on the required alignment for the actual buffer. But generally, you should know in advance based on the schema.

In your test program you can use alignas(4) char buf[1024]; (from memory). If you use malloc, it depends on the platform, but most implementations will align to at least 8 bytes.

Even if your platform does not support alignas, it will be available with flatcc due the portable library. It is needed to ensure structs are aligned correctly in generated code.

Answer 8 · 2022-05-20T13:37:23.000Z

Oh, and there is a portable aligned_alloc and aligned_free with flatcc that you can use if you need more than what malloc provides. See test programs in flatcc, they use these. aligned_alloc is C11 std., but aligned_free is flatcc specific and defaults to free on C11, but other platforms need specialized free operations for aligned allocation, so flatcc needs aligned_free to accommodate.

Answer 9 · 2022-05-20T13:48:30.000Z

So I wasn't clear in my last answer.
I told you how to find out how much to align, but not what to align.
You must align the memory buffer where you read the buffer from.
Some CPU's have an address system where an address must be aligned to 4 bytes if they try to read a word of 4 bytes an so forth. More recent CPU's and Intel for a long time, can read from other addresses, but internally it typically means the CPU makes a special case and reads two words, then shifts the result. This is slower on some hardware, but barely noticable on modern high end hardware. However, you still have a similar problem when reading across cache lines - which will work, but it will be slower. This is not directly relevant to the discussion, but this is one reason why FlatBuffers makes it possible to align to more than 8 bytes along with hardware mapping of I/O & graphics.