Refactor message header alignment and "raw" types to fit a clear pattern
skliper opened this issue · 8 comments
Is your feature request related to a problem? Please describe.
Aligned version of message headers currently in SB, shows different handling of the base type.
cFE/fsw/cfe-core/src/inc/cfe_sb.h
Lines 150 to 163 in 9804b59
Describe the solution you'd like
See discussion below.
Describe alternatives you've considered
None.
Additional context
Brought up as part of #777/#998 review.
Requester Info
Jacob Hageman - NASA/GSFC (from CCB discussion)
Regarding discussion that came up in today's CCB - this may also affect/determine how we proceed with #689:
This primarily deals with the alignment of messages. The fundamental concern - and what needs to be distinguished here - is what module is actually imposing the alignment requirement.
Messages themselves only need to be aligned sufficiently for the types they actually contain (i.e. a message containing only uint8
bytes needs no special alignment, while a message containing any uint32
value needs 32 bit alignment, and so forth.
However, the software bus implementation (CFE_SB) internally aligns all of its buffers, and it uses the (historical) CFE_SB_Msg_t*
pointer type as its interface type for both CFE_SB_SendMsg()
and CFE_SB_RcvMsg()
as well as the other msg introspection calls.
Therefore -- since the extra alignement is actually driven by CFE_SB and its underlying implementation - my recommendation is to put the unions here. So:
-
The
msg
module should provideCFE_MSG_Message_t
,CFE_MSG_TelemetryHeader_t
, andCFE_MSG_CommandHeader_t
types. In this case theCFE_MSG_Message_t
should be the actual structure type, not a union. -
The
CFE_SB
module should provideCFE_SB_Msg_t
, andCFE_SB_CmdHdr_t
andCFE_SB_TlmHdr_t
- which are simply variants of those types provided by MSG but aligned properly for usage by software bus. (that is, they areunion
-ized here) -
The "raw" messages in apps should be defined using only types from the MSG module (i.e. in the respective
*_msg.h
file). -
But when instantiating a message buffer locally for sending (i.e. in telemetry-generating code) the buffer should be instantiated using CFE_SB_Msg_t as its base type. This way - in the event that the telemetry message has low alignment requirements - it will not generate a warning, because the buffer will be aligned to
CFE_SB_Msg_t
level - if this is greater than the TLM message itself. -
On the other hand - When receiving a command - it is OK to cast from
CFE_SB_Msg_t*
to the actual command type - because this should only be a downgrade in alignment, never an upgrade.
The reason for following this pattern is that the sizeof()
operator - when applied to the message types - will work as expected, and won't reflect extra padding that might have been added due to runtime padding for SB.
For a specific case where this distinction matters - consider CFE_ES_Restart_t
. This command has a single uint16
payload, and is traditionally expected to be 10 bytes total in length when the CMD header is added.
When sent across the software bus interface - the buffer will actually be 12 bytes or maybe even 16. But that extra padding is only necessary for runtime data alignment - the extra bytes should not be interpreted as part of the message.
If these patterns I'm suggesting above are followed then things should mostly work (I think, or at least as well as can be expected using C structs and not a proper data dictionary implementation.
If you aren't using the aligned headers to define the message, wouldn't we also need to make the raw headers end on the most restrictive boundary, so the payload will always start in the same location (and not break CFE_SB_GetUserData)? For "instantiating a message buffer locally" do you mean do the union between the raw message and CFE_SB_Msg_t
within the app? So really going back to the whole message buffer pattern that I took out in nasa/sample_app#104 for example?
typedef union
{
CFE_SB_Msg_t MsgHdr;
SAMPLE_APP_HkTlm_t HkTlm;
} SAMPLE_APP_HkBuffer_t;
If that's the case, don't CFE_SB_CmdHdr_t
and CFE_SB_TlmHdr_t
really have no use anymore?
@ejtimmon - sounds like this isn't settled yet...
we also need to make the raw headers end on the most restrictive boundary, so the payload will always start in the same location (and not break CFE_SB_GetUserData)?
If I'm interpreting correctly - you are referring to the possibility of a gap occurring between the header as defined by MSG and the actual start of data (for instance if the TLM header is 12 bytes and the payload contains a double
which requires 8 byte alignment).
As you noted in #777 though, the CFE_SB_GetUserData()
function is inherently fragile and one should use the actual structure definition for this. (e.g. use something like offsetof(mymessage_t, Payload)
or &mymessage->Payload
). It is only an issue when you are making assumptions about how the compiler may (or may not) pad something out. (as an aside I'm tempted to say CFE_SB_GetUserData should be deprecated, but in the event that one moves toward a dictionary-based solution, it becomes more possible to implement this reliably).
Pedantically speaking there also no guarantee that using CFE_SB_CmdHdr_t
(the union) will avoid implicit padding any better than using CFE_MSG_CommandHeader_t
(actual type) does. Because avoiding padding requires an assumption about what the worst-case alignment will be for any payload will be - and accounting for it in the respective struct/union. So - as long as MSG types already account for this by explicitly making CFE_MSG_CommandHeader_t and CFE_MSG_TelemetryHeader_t to be a multiple of 8 bytes instead of 4 - this is probably as good as can be done when using C structs to define messages. Adding a union won't avoid padding any better.
Again, padding avoidance is based on the assumption that the most probable worst case payload alignment will be 8 bytes, and it is therefore unlikely that the compiler will add extra padding in this case. "assumption", "probable" and "unlikely" being the key weasel-words here, because technically any compiler can add padding or extra alignment wherever it sees fit on the particular architecture it is compiling for. (i.e. it is entirely plausible that some FPU out there works better with 16 byte alignment, particularly if you use the long double
type). As a C programmer there is no way you can absolutely guarantee that a compiler won't add padding between struct
members - the C standard says that this is platform-defined.
My main point is - this can result in an unanticipated implicit padding gaps either way - whether the unionized "CFE_SB_CmdHdr_t" was used or the base structure was used to declare the struct. Compilers can do what they want - developers can only guess at the most likely memory padding, nothing is guaranteed about the layout of bits within a struct that isn't directly specified in the C standard.
Hence why I would prefer to see the struct definitions in the *_msg.h
header file(s) convey the intent of what the cmd/tlm packets contain in the simplest terms possible. The distinction is that we are not sending a CFE_SB_Msg_t
or CFE_SB_CmdHdr_t
or CFE_SB_TlmHdr_t
on the network - this exists only for internal use within SB for transport/API purposes. The CFE_MSG_CommandHeader_t
or CFE_MSG_TelemetryHeader_t
are what is intended to be sent/received, so this is how the msg structure should be defined in an app.
If that's the case, don't
CFE_SB_CmdHdr_t
andCFE_SB_TlmHdr_t
really have no use anymore?
I think so - these seem like surplus types now given the current design. Perhaps we can remove them.
If my memory serves correctly, the main reason for making CFE_SB_Msg_t
into a union in the first place was for two reasons:
- Provide a modicum of type safety for messages - Having a non void pointer base type means the user must actually declare their buffer using this type or get a compiler warning. The alternative being
void*
where anything goes (no warnings, no type checks). - But because of that type check - having it be a union with
uint32
increased its alignment requirement and thus avoided a compiler warning in the command handlers - i.e. when casting fromCFE_SB_Msg_t*
to something else - this increases the likelihood that this would be a downgrade (or equal) in alignment, not an upgrade - so no warning.
We could simplify a lot by just making the interface type in SB to be void*
instead of CFE_SB_Msg_t*
- but this would have no type checking or enforcement so its bad for safety - but it sure would make things simpler.
We need to resolve this ASAP for Caelum... realizing it's not a perfect solution, we need something good enough. I understand the lack of guarantee, but a fix that works for most systems is better than something that is "broken" on all of them. I do dislike CFE_SB_GetUserData, but even if we deprecated it I also think it's undesirable to have the payload start at different locations depending on the alignment requirements of the payload (makes ground system databases more complex).
Anyways, in an attempt to move forward I'll implement the alignment suggestion on top of #998 so we can at least get a PR in to discuss and see how the pattern settles.
My recommendation is to use the envelope/post office analogy as I mentioned in #998 (review)
- CFE SB API deals with envelopes
- CFE MSG API deals with contents
Upon receipt of a message we examine the MsgID and "open" the envelope. This can be a typecast to the CFE_MSG type (typically CFE_MSG_CommandHeader_t
) or (preferably) by accessing the correctly-typed member within the CFE_SB_Msg_t
envelope.
When sending a message we "wrap" the raw message in the envelope for sending.
The CFE SB API should only have abstractions for those basic items which are required for message transport and delivery such as its MsgId and size. These should operate on a CFE_SB_Msg_t*
type. This would also be the right place to deal with any authentication data if we ever need to implement that for security reasons.
The CFE MSG API should have abstractions for all other metadata - timestamps, command codes, checksums, and any other relevant info.
The CFE SB API should probably also provide helpers (either macros or helper functions) to facilitate the proper "packaging" and "unpackaging" of these raw messages. Right now this is basically just a typecast that is performed directly in the apps (e.g. casting of CFE_SB_Msg_t*
to its specific command type within the command dispatcher) but this could be more refined.