FIXTradingCommunity/fix-simple-binary-encoding

Why is nullValue not allowed for required field?

OleksandrKvl opened this issue · 4 comments

Since nullValue only makes sense for optional fields, why is it not allowed for required fields? There are two different properties: data range and presence, where optional presence is represented using magic value, why should this value be out of range? This might force people to use twice as large type just because they need that last bit. I propose to make nullValue a valid value for required fields and to adjust min/max for built-in types to a full range, keeping nullValues as is. For example, for uint8: min=0, max=255, null=255. The logic is that nullValue is a valid value for some type, it's just interpreted as a logical absence of value of an optional field.

@OleksandrKvl personally, I agree with your sentiment and proposed that to working group. However, it was the consensus that it was a simplification to have one domain of values for an encoding type, whether a field in required or optional.

@donmendelson it's not just a sentiment, technically speaking, to correctly represent optional values you need additional flag. I'd even propose to add such flag implicitly but it's probably too late now. Imagine situation that someone really needs those null values but a perfect SBE implementation doesn't allow them to do this because null values are invalid even for required fields. I think there application level should have a freedom of how to treat the raw data fields, making null value illegal breaks it.
Btw, another issue related to null values is null value of a composite. This just doesn't make any sense to me. If I have a composite with perfectly legal first optional field, the whole composite should be treated as a Null because of this strange rule. And it's not even possible to implement because the first field might be a static array and there's no null value for it. Only additional flag can correctly solve this problem.

@donmendelson another inconsistency is related to numInGroup. Here it says:

The minValue attribute controls the minimum number of entries, overriding the default of zero, and the maxValue attribute restricts the maximum entry count to something less than the maximum corresponding to its primitiveType.

But maximum values for primitive types are not equal to the real ones, e.g. for uint8 it's 254, not 255. However, the section above says that for uint8 the maximum number of entries is 255.
Imagine that we have C++ implementation like this:

// generic wrapper for required fields
template<typename T>
class required{
public:
    bool is_valid();  // checks that `minValue <= value <= maxValue`  && `value != nullValue`
    // ...
private:
    T value;
};

// groupSizeEncoding composite representation
class groupSizeEncoding{
public:
    required<std::uint8_t> numInGroup();
    // ...
};

// client's code:
groupSizeEncoding header = group.header();
assert(header.numInGroup().is_valid());

This code will fail if numInGroup is 255. Current design means that composite groupSizeEncoding should have one representation when it's used as a normal composite and another when it's used as a header type.
I don't know what the original intent was but to me SBE is more about binary layout, not about custom semantics. At least it should allow me to use "raw" data fields and have my own semantics at the application level. This artificial restriction of all types' valid ranges doesn't allow to do that (and keep protective is_valid-like assertions). I think that most people don't even know that uint8 in SBE has different range from uint8 in their language.

This has already been decided for SBE v1.0, a final standard. Closing this issue, but you may wish to make a proposal for v2.0.