raydac/java-binary-block-parser

Conditional parsing

mmindenhall opened this issue · 1 comments

Hi Igor,

Here's another feature to consider. It would be very cool (IMHO) to have support for parsing sections of a script only if a conditional expression evaluates to true (the condition being determined from data that was already parsed).

For example, the message format I have been working with is documented here. This message format has a lot of optional fields or sets of fields, with a bit or byte indicating whether they are included.

The first byte of the message indicates whether the "options header" is present (msb set), and if so, what other sets of fields (all optional) are present within the header. There's an "options extension" section within the options header, that includes additional optional fields. The message header then has a field to indicate which message type will follow, and the message types also have optional sets of fields indicated by bits. It would be really cool if a single parse script could describe this type of message, with conditional expressions determining which sections of the script get parsed, according to what is there.
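
As a minimal sketch of how the first byte drives everything that follows, the flags can be decoded in plain Java. The class name `OptionsByte` and its methods are hypothetical. The bit positions are an assumption: only the msb is stated above, and the other positions are taken from the field order of the script proposed later in this issue, with the first declared field as the least significant bit.

```java
/** Flags carried by the first byte of the message (hypothetical helper). */
final class OptionsByte {
    // Assumed bit positions: declaration order of the proposed script,
    // first-declared field = least significant bit.
    static final int MOBILE_ID      = 1 << 0;
    static final int MOBILE_ID_TYPE = 1 << 1;
    static final int AUTH_WORD      = 1 << 2;
    static final int ROUTING        = 1 << 3;
    static final int FORWARDING     = 1 << 4;
    static final int RESPONSE_REDIR = 1 << 5;
    static final int OPTIONS_EXT    = 1 << 6;
    static final int PRESENT        = 1 << 7; // msb: options header is present

    private final int value;

    OptionsByte(int rawByte) {
        this.value = rawByte & 0xFF; // treat the byte as unsigned
    }

    /** True when the msb is set, i.e. the options header is present. */
    boolean headerPresent() {
        return (value & PRESENT) != 0;
    }

    /** True only when the header is present AND the given section bit is set. */
    boolean has(int flag) {
        return headerPresent() && (value & flag) != 0;
    }
}
```

For example, `new OptionsByte(0x85).has(OptionsByte.AUTH_WORD)` is true under these assumptions, because 0x85 has bits 7, 2 and 0 set.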

Just FYI, here's how I accomplished parsing a message type using the library (plus my extensions for strings, packed decimal, conversion to map):

  1. Create the JBBPBitInputStream wrapper around the ByteArrayInputStream outside of the parser so it could be reused by subsequent parsers while keeping the correct position
  2. Parse the first byte into a struct.
  3. If msb is set, parse the option header as follows:
    1. Determine which data sections are within the options header, as indicated by other bits in the first byte
    2. Concatenate together the parse scripts for the data sections that are present into a single string
    3. Parse the rest of the options header, except the options extension
    4. Repeat the above conditional parsing strategy for the options extension (if present)
  4. If msb is not set, reset() the input stream and start with the message header
  5. Parse the message header
  6. Determine the message type from the message header
  7. Parse the message type (including optional fields -- e.g., "extension strings" in message type 3)

All of that requires 5 separate calls to JBBPParser.prepare(...).parse(...) to parse a message, but with conditional parsing it could be done with a single parse.
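
The staged workflow above can be sketched with plain `java.io` to show the control flow. It is not the JBBP code actually used, and `StagedOptionsHeaderReader` is a hypothetical name. Two assumptions are made: each optional section is a length byte followed by that many payload bytes, and the section bits sit in the order of the proposed script, bit 0 first. The options extension is deliberately not handled.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;

final class StagedOptionsHeaderReader {
    // Section names in assumed bit order, bit 0 first.
    private static final String[] SECTIONS = {
        "mobile_id", "mobile_id_type", "authentication_word",
        "routing", "forwarding", "response_redirection"
    };
    private static final int OPTIONS_EXT = 1 << 6;
    private static final int PRESENT = 1 << 7; // msb

    /**
     * Reads the options header if present and leaves the stream positioned
     * at the message header. The stream must support mark/reset.
     */
    static Map<String, byte[]> readOptionsHeader(DataInputStream in) throws IOException {
        Map<String, byte[]> sections = new LinkedHashMap<>();
        in.mark(1);
        int options = in.readUnsignedByte();          // step 2
        if ((options & PRESENT) == 0) {
            in.reset();                               // step 4: un-read the byte
            return sections;
        }
        for (int bit = 0; bit < SECTIONS.length; bit++) {
            if ((options & (1 << bit)) != 0) {        // step 3.1: section present?
                // Assumed layout: length byte, then 'length' payload bytes.
                byte[] payload = new byte[in.readUnsignedByte()];
                in.readFully(payload);
                sections.put(SECTIONS[bit], payload);
            }
        }
        if ((options & OPTIONS_EXT) != 0) {
            // Step 3.4 would repeat the same pattern; its layout is not guessed here.
            throw new UnsupportedOperationException("options extension not handled in this sketch");
        }
        return sections;
    }
}
```

The same `DataInputStream` (for example one wrapping a `ByteArrayInputStream`) is then handed to whatever reads the message header, which mirrors reusing one bit stream across several parsers.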

I was thinking that conditional expressions could be implemented at the struct level within the DSL syntax, similar to how expressions can be used to calculate the extra data for fields. The script to parse the options header might look something like this:

options_header {
  options_byte {
    bit mobile_id;
    bit mobile_id_type;
    bit auth_word;
    bit routing;
    bit forwarding;
    bit response_redir;
    bit options_ext;
    bit present;
  }
  reset:(!present) {
    // reset the options_byte just read
    reset$$:1;
  }
  mobile_id:(present && mobile_id) {
    ubyte mobile_id_length;
    bcd:(mobile_id_length) mobile_id;
  }
  mobile_id_type:(present && mobile_id_type) {
    ubyte mobile_id_type_length; 
    ubyte mobile_id_type;
  }
  authentication_word:(present && auth_word) {
    ubyte authentication_length; 
    int authentication;
  }
  routing:(present && routing) {
    ubyte routing_length; 
    byte[routing_length] routing;
  }
  forwarding:(present && forwarding) {
    ubyte forwarding_length;
    byte[forwarding_length] forwarding;
  }
  response_redirection:(present && response_redir) {
    ubyte response_redirection_length;
    ubyte[4] response_redirection_addr;
    ushort response_redirection_port;
  }
  options_extension:(present && options_ext) {
    ubyte options_extension_length;
    bit esn_enabled;
    bit vin_enabled;
    bit encryption_service_enabled;
    align;
    options_extension_esn:(esn_enabled) {
      ubyte esn_length;
      bcd:(esn_length) esn;
    }
    options_extension_vin:(vin_enabled) {
      ubyte vin_length;
      str:(vin_length) vin;
    }
    options_extension_encryption:(encryption_service_enabled) {
      ubyte encryption_length;
      byte[encryption_length] encryption_service_bytes;
    }
  }
}

I had similar needs when I was developing an emulator of an old computer: it used a format with several subsets and its own specific internal organization.

Several parsers with different scripts can work with the same JBBP stream, which lets you keep the logic at the Java level. You could implement several parsers for different parts of the protocol and handle the branching in Java. That approach may well be faster and more flexible, because in any case I can't implement a logic mechanism at the script level that is as powerful as Java :)
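
One way to picture this suggestion is a small dispatch table: one parser per protocol part, all reading from the same stream, with the choice between them made in Java. This is only a sketch with hypothetical names (`MessageDispatcher`, `PartParser`), and it assumes a one-byte message header that holds nothing but the message type, which the actual format may not match. It uses plain `java.io` rather than the JBBP API.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

final class MessageDispatcher {
    /** One parser per protocol part; all of them read from the same stream. */
    interface PartParser {
        String parse(DataInputStream in) throws IOException;
    }

    private final Map<Integer, PartParser> byType = new HashMap<>();

    void register(int messageType, PartParser parser) {
        byType.put(messageType, parser);
    }

    /** Reads the (assumed one-byte) message type, then delegates to its parser. */
    String parseMessage(DataInputStream in) throws IOException {
        int messageType = in.readUnsignedByte();
        PartParser parser = byType.get(messageType);
        if (parser == null) {
            throw new IOException("Unknown message type: " + messageType);
        }
        // The selected parser continues from the current stream position.
        return parser.parse(in);
    }
}
```

With JBBP, each `PartParser` would presumably wrap its own prepared parser and be given the shared bit stream, so the position carries over from one part to the next.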