Ruby: `enum.to_i` implementation is buggy due to string replacement
generalmimon opened this issue · 0 comments
When stumbling upon this line of code in KSC, I got suspicious because I'm generally convinced is that any kind of string manipulation on translated expressions (i.e. whenever you don't treat the translated expressions as opaque strings) is naive, flawed and can cause hard-to-detect bugs:
enumDirectMap.replace(enumNameDirect, enumNameInverse)
It turns out that this use of String.replace()
does indeed cause an observable bug. See reproduction code below. For comparison, I wrote two .ksy specs that are very similar, one where the bug isn't triggered (enum_nested_no_bug.ksy
) and one where the bug is triggered (enum_nested_bug.ksy
):
meta:
id: enum_nested_no_bug
instances:
pet_1_to_i:
value: a_b::abc::cat.to_i
types:
a_b:
enums:
abc:
42: cat
meta:
id: enum_nested_bug
instances:
pet_1_to_i:
value: a_b_c::abc::cat.to_i
types:
a_b_c:
enums:
abc:
42: cat
diff --git 1/enum_nested_no_bug.ksy 2/enum_nested_bug.ksy
index 1d88414..c5bb137 100644
--- 1/enum_nested_no_bug.ksy
+++ 2/enum_nested_bug.ksy
@@ -1,10 +1,10 @@
meta:
- id: enum_nested_no_bug
+ id: enum_nested_bug
instances:
pet_1_to_i:
- value: a_b::abc::cat.to_i
+ value: a_b_c::abc::cat.to_i
types:
- a_b:
+ a_b_c:
enums:
abc:
42: cat
I also added a debug print in KSC to make it easy to see how the bug happens (this is at RubyTranslator.scala:67
):
--- i/shared/src/main/scala/io/kaitai/struct/translators/RubyTranslator.scala
+++ w/shared/src/main/scala/io/kaitai/struct/translators/RubyTranslator.scala
@@ -64,7 +64,9 @@ class RubyTranslator(provider: TypeProvider) extends BaseTranslator(provider)
val enumNameDirect = Utils.upperUnderscoreCase(enumTypeAndName.last)
val enumNameInverse = RubyCompiler.inverseEnumName(enumNameDirect)
- enumDirectMap.replace(enumNameDirect, enumNameInverse)
+ val res = enumDirectMap.replace(enumNameDirect, enumNameInverse)
+ println("replacing " + enumNameDirect + " with " + enumNameInverse + " in " + enumDirectMap + " -> " + res)
+ res
}
override def arraySubscript(container: Ast.expr, idx: Ast.expr): String =
$ kaitai-struct-compiler -- --verbose file -t ruby enum_nested_no_bug.ksy enum_nested_bug.ksy
parsing enum_nested_no_bug.ksy...
reading enum_nested_no_bug.ksy...
... compiling it for ruby...
replacing ABC with I__ABC in AB::ABC -> AB::I__ABC
.... writing enum_nested_no_bug.rb
parsing enum_nested_bug.ksy...
reading enum_nested_bug.ksy...
... compiling it for ruby...
replacing ABC with I__ABC in ABC::ABC -> I__ABC::I__ABC
.... writing enum_nested_bug.rb
You can already see the problem - the valid reference to the "enum inverse map" would be ABC::I__ABC
, but the compiler ended up with I__ABC::I__ABC
because of the naive string substitution. This is reflected in the generated code:
diff --git 1/enum_nested_no_bug.rb 2/enum_nested_bug.rb
index 4ddbb4e..8a8b84f 100644
--- 1/enum_nested_no_bug.rb
+++ 2/enum_nested_bug.rb
@@ -6,7 +6,7 @@ unless Gem::Version.new(Kaitai::Struct::VERSION) >= Gem::Version.new('0.11')
raise "Incompatible Kaitai Struct Ruby API: 0.11 or later is required, but you have #{Kaitai::Struct::VERSION}"
end
-class EnumNestedNoBug < Kaitai::Struct::Struct
+class EnumNestedBug < Kaitai::Struct::Struct
def initialize(_io, _parent = nil, _root = nil)
super(_io, _parent, _root || self)
_read
@@ -15,7 +15,7 @@ class EnumNestedNoBug < Kaitai::Struct::Struct
def _read
self
end
- class AB < Kaitai::Struct::Struct
+ class ABC < Kaitai::Struct::Struct
ABC = {
42 => :abc_cat,
@@ -32,7 +32,7 @@ class EnumNestedNoBug < Kaitai::Struct::Struct
end
def pet_1_to_i
return @pet_1_to_i unless @pet_1_to_i.nil?
- @pet_1_to_i = AB::I__ABC[:abc_cat]
+ @pet_1_to_i = I__ABC::I__ABC[:abc_cat]
@pet_1_to_i
end
end
It's not hard to imagine that the enum_nested_no_bug.rb
file works, whereas enum_nested_bug.rb
doesn't:
$ ruby -I /c/temp/kaitai_struct/runtime/ruby/lib -e 'require_relative "enum_nested_no_bug"' -e 'r = EnumNestedNoBug.new(Kaitai::Struct::Stream.new(""))' -e 'pp r.pet_1_to_i'
42
pp@DESKTOP-89OPGF3 MINGW64 /c/temp/ks-experiments/ruby-enum-str-replace-bug
$ ruby -I /c/temp/kaitai_struct/runtime/ruby/lib -e 'require_relative "enum_nested_bug"' -e 'r = EnumNestedBug.new(Kaitai::Struct::Stream.new(""))' -e 'pp r.pet_1_to_i'
C:/temp/ks-experiments/ruby-enum-str-replace-bug/enum_nested_bug.rb:35:in `pet_1_to_i': uninitialized constant EnumNestedBug::I__ABC (NameError)
from -e:3:in `<main>'
This bug well demonstrates that string manipulation on translated expressions should be avoided as much as possible, as it almost always has unintended consequences (this is because it's inherently the wrong tool for the task).
Git blames kaitai-io/kaitai_struct_compiler#159 authored by @ams-tschoening and merged by @GreyCat. I'm not mentioning this because I want to be mean, but I wanted to let you know about this pitfall so we can avoid it in the future. Everyone makes mistakes, but I think this kind can be easily avoided if you know what to look out for.