kaitai-io/kaitai_struct

Kaitai Struct: use a field (u1) either as a type chooser or as part of the next attribute (str)

Kaipheu opened this issue · 2 comments

Hello, thanks for this very useful project!

I've spent a day trying to describe part of a file format.

Two u1 "del" bytes are used as delimiters and also determine the structure of the data between them. These blocks are chained, like this:

struct : del|content|del|del|content|del...

values : 01|AC4A55|01|03|0022AA|03...

So I can describe that as:

seq:
  - id: delimiter_start
    type: u1
    enum: content_enum_type
  - id: content
    type:
      switch-on: delimiter_start
      cases:
        "content_enum_type::one": type_one
        "content_enum_type::two": type_two
        "content_enum_type::three": type_three
  - id: delimiter_stop
    type: u1
types:
  type_one:
    seq:
      - id: a
        type: u1
      - id: b
        type: u1
      - id: c
        type: u1
  type_two:
    seq:
      - id: d
        type: u2  # multi-byte: needs meta/endian, or u2le / u2be
      - id: e
        type: u1
  type_three:
    seq:
      - id: f
        type: u4  # multi-byte: needs meta/endian, or u4le / u4be
enums:
  content_enum_type:
    0x1: one
    0x2: two
    0x3: three
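
For readers who want to see the dispatch outside of Kaitai, here is a minimal Python sketch of the same logic: read the start delimiter, pick a layout from it, read the content, then read the stop delimiter. It is not Kaitai compiler output. The `LAYOUTS` table and the `parse_blocks` function are hypothetical names, big-endian byte order is an assumption (the issue does not state it), and the check that the stop delimiter equals the start delimiter is inferred from the sample values only.

```python
import struct

# Hypothetical layouts mirroring type_one / type_two / type_three above.
# Big-endian (">") is an assumption; the issue does not state the byte order.
LAYOUTS = {
    0x01: ("type_one", ">BBB"),  # three u1 fields
    0x02: ("type_two", ">HB"),   # one u2 field, one u1 field
    0x03: ("type_three", ">I"),  # one u4 field
}


def parse_blocks(data: bytes):
    """Parse a chain of del|content|del blocks into (type_name, fields) tuples."""
    blocks = []
    pos = 0
    while pos < len(data):
        delimiter_start = data[pos]
        name, fmt = LAYOUTS[delimiter_start]  # KeyError on an unknown delimiter
        size = struct.calcsize(fmt)
        fields = struct.unpack_from(fmt, data, pos + 1)
        delimiter_stop = data[pos + 1 + size]
        # Assumption taken from the sample values: stop repeats the start value.
        if delimiter_stop != delimiter_start:
            raise ValueError(f"mismatched delimiters at offset {pos}")
        blocks.append((name, fields))
        pos += size + 2  # start delimiter + content + stop delimiter
    return blocks


# One type_one block followed by one type_two block.
sample = bytes.fromhex("01AC4A5501020022AA02")
blocks = parse_blocks(sample)
```

Because every layout has a fixed size here, the selector byte alone is enough to know how far to read; the difficulty described next only appears once an undelimited string enters the chain.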

This works well, but in one case we have to store UTF-8 strings that have no delimiters, i.e.: del|content|del|string_content|del|content|del

This is where I'm blocked: I need to read delimiter_start but possibly not store it.

I've tried to use instances, but I can't get the position of delimiter_start. I can't create a substream to work with relative positions either, because the length of the content isn't fixed (so I can't use size) and the terminator can be either of two values (so I can't use terminator).

Is there a solution I'm missing, or does a new feature need to be implemented?
I have tried several things, but I always get stuck on the fact that once delimiter_start has been read I can't "un-read" it, which results in:
delimiter_start = H, string = ELLO WORLD.

I also tried this

...
  - id: delimiter_start
    type: u1
    enum: content_enum_type
    if: delimiter_start == content_enum_type::one
#    if: delimiter_start == content_enum_type::two
#    if: delimiter_start == content_enum_type::three
...

But this doesn't work; the compiled JS looks like:

...
if(this.delimiterStart == myType.content_enum_type.ONE){
   this.delimiterStart = this._io.readU1();
}
...

this.delimiterStart is still undefined when the condition is evaluated (the readU1 only happens inside the if), so the condition is always false.
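
The same ordering problem can be reproduced in a few lines of Python. This is only an illustrative sketch of the generated logic, not actual compiler output; the `Block` class and the `ONE` constant are hypothetical names.

```python
import io

ONE = 0x01  # stands in for content_enum_type::one


class Block:
    def __init__(self, stream):
        # Not read yet, like `undefined` in the generated JS.
        self.delimiter_start = None
        # The condition is checked before anything is read from the stream,
        # so it compares None with ONE and the read is skipped.
        if self.delimiter_start == ONE:
            self.delimiter_start = stream.read(1)[0]


stream = io.BytesIO(bytes.fromhex("01AC4A5501"))
block = Block(stream)
```

After construction, `block.delimiter_start` is still `None` and the stream position has not moved, which matches the behaviour described above: a field's own condition cannot depend on the value it has yet to read.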

Sounds like your "stopping delimiter" is present in some cases and missing in some others. Why not include it into types (e.g. type_one, type_two, type_three), where it is necessary, but omit it from the UTF-8-related type where it's not needed?

Hello, thank you for your answer!
Yes, I can do that, but a field that has no delimiters will have its first byte parsed as a delimiter and not as part of the field's content.
Something like: delimiter_start = H, string = ELLO WORLD.
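
To make the desired behaviour concrete, here is a minimal Python sketch of a "peek, then decide" parser: it looks at the next byte without consuming it, treats it as a delimiter only if it is one of the known values, and otherwise reads a string up to the next known delimiter value. It illustrates the logic only and is not a Kaitai Struct solution. The `CONTENT_SIZES` table and `parse_mixed` function are hypothetical, and the sketch assumes that the string never contains the bytes 0x01 to 0x03.

```python
# Hypothetical fixed content sizes for type_one, type_two and type_three.
CONTENT_SIZES = {0x01: 3, 0x02: 3, 0x03: 4}


def parse_mixed(data: bytes):
    """Split data into ("block", delimiter, content) and ("string", text) items."""
    items = []
    pos = 0
    while pos < len(data):
        peeked = data[pos]  # look at the byte without consuming it
        if peeked in CONTENT_SIZES:
            size = CONTENT_SIZES[peeked]
            content = data[pos + 1 : pos + 1 + size]
            items.append(("block", peeked, content))
            pos += size + 2  # start delimiter + content + stop delimiter
        else:
            # Assumption: the string runs until the next known delimiter
            # value or the end of the data, and never contains 0x01..0x03.
            end = pos
            while end < len(data) and data[end] not in CONTENT_SIZES:
                end += 1
            items.append(("string", data[pos:end].decode("utf-8")))
            pos = end  # the peeked byte stayed part of the string
    return items


# del|content|del|string_content|del|content|del
sample = bytes.fromhex("01AC4A5501") + b"HELLO WORLD" + bytes.fromhex("020022AA02")
items = parse_mixed(sample)
```

With this ordering the "H" stays part of the string instead of being stored as delimiter_start, which is the behaviour the format needs.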