kaitai-io/kaitai_struct

Using enum members as int is broken in ruby

smx-smx opened this issue · 3 comments

The following ksy:

meta:
  id: kaitai_bug
  file-extension: kaitai_bug
  endian: le
enums:
  bug:
    0: item0
    1: item1
    2: item2
seq:
  - id: array
    type: u4
    repeat: expr
    repeat-expr: 3
instances:
  item1:
    value: array[bug::item1.as<u4>]

generates the following incorrect ruby code:

  BUG = {
    0 => :bug_item0,
    1 => :bug_item1,
    2 => :bug_item2,
  }
  I__BUG = BUG.invert
  ...
  def item1
    return @item1 unless @item1.nil?
    @item1 = array[:bug_item1]
    @item1
  end

which triggers the following error when accessing item1:

no implicit conversion of Symbol into Integer (TypeError)

because the array key is not a number.

I guess the expected output should be array[I__BUG[:bug_item1]] ?

Ok... so the issue is that i used .as<u4> instead of to_i, which is required in order to trigger enumToInt in the translateAttribute function

https://github.com/kaitai-io/kaitai_struct_compiler/blob/8aeb79a0069a708bd10cc5b4cc4e3ddf89f58289/shared/src/main/scala/io/kaitai/struct/translators/CommonMethods.scala#L167-L171

      case et: EnumType =>
        attr.name match {
          case "to_i" => enumToInt(value, et)
          case _ => throw new TypeMismatchError(s"called invalid attribute '${attr.name}' on expression of type $valType")
        }

so to_i and .as<u4> might do the same thing or not, depending on the target language.
For example, as<u4> works in Javascript and PHP, but not in Ruby (the ones i tested)

so the issue is that i used .as<u4> instead of to_i

Yes, your conclusion is correct. The thing is that only a very few type castings are actually well-defined across target languages right now - the situation described in https://doc.kaitai.io/user_guide.html#typecast is one of them. Note that the type cast is basically only designed to claim that we already have the value of the type we're casting to, we just need to tell the static type system. This means that the type casting is a no-op in dynamically-typed languages (JavaScript, Python, Ruby etc.) because such languages allow duck-typing by default, and we only need to translate it in statically-typed languages. It is not intended to do any non-trivial conversions like converting an enum value to an integer in your case.

Unfortunately, there are no checks in KSC that validate whether the type casting makes sense (the compiler allows casting anything to anything), and I'm aware that many people use it incorrectly, so we really need #696 so that the compiler can catch this and report an error.

Duplicate of #696