Variations on Stretched String
security-curious opened this issue · 4 comments
This is a follow-up to my post on #8, but I am creating a separate issue because it focuses on variations of the stretched string rather than the early return or commenting-out techniques.
You were correct that I didn't mention stretched string in #8 because I assumed it would work in Ruby as well. Stretched string doesn't really interest me that much though because as you noted in your paper:
there are other, perhaps simpler, ways that an adversary can cause a string comparison to fail without visual effect....such as the Zero Width Space
What might be nice is if we could use Bidi not only to cause a conditional to fail (just like ZWSP) but to cause the condition to pass. I don't think that is possible with a string, but we could stretch other things in the language to achieve this. Your paper really only mentions comments and strings as allowing Unicode, but depending on the grammar other tokens can contain Unicode characters too.
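For reference, the ZWSP failure mode quoted above can be sketched in a few lines of Ruby. The method name admin_role? is hypothetical and only serves the illustration; the escape "\u200B" is written out so the invisible character is visible in the source.

```ruby
# Hypothetical check, used only to illustrate the comparison.
def admin_role?(role)
  role == 'admin'
end

plain   = 'admin'
tainted = "ad\u200Bmin" # renders like "admin" but carries U+200B ZERO WIDTH SPACE

puts admin_role?(plain)   # true
puts admin_role?(tainted) # false
puts tainted.length       # 6, one more than the five visible letters
```

This can only turn a passing check into a failing one, which is why the variations below aim for the opposite direction.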
Below are three other types of stretching that allow us to evaluate a conditional to true instead of failing the conditional. I am using Ruby because I know it well, but I am guessing some of these could be applicable to other languages. If they are of interest perhaps we can improve the examples, see how they are applicable to other languages and include them with your examples. With all of these examples the syntax highlighting somewhat gives the issue away but perhaps in a big block of other code it wouldn't be noticed.
Stretched Regex
The most obvious alternative to a stretched string literal is a stretched regular expression. I work on a real-world application where the roles are stored as a comma-separated string for historical reasons. So an admin will have:
user.roles = 'admin,manager,user'
while a regular user might have just:
user.roles = 'user'
I could conceive of a method to see if the user is an admin defined as:
def admin?
  roles =~ /admin/
end
Using that as my example scenario, consider the implementation below:
class User
  attr_accessor :roles
  def admin?
@roles =~ /admin|user/ #/ # Restrict from
  end
end
user = User.new
user.roles = 'user,manager'
if user.admin?
  puts 'admin!'
else
  puts 'regular user :('
end
The comment seems a bit odd with the extra |, / and # characters. But none of that should matter, since everything after the # should be ignored. If you run the above you would expect it to output regular user :( but instead it outputs admin!.
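Setting the Bidi characters aside, the reason the check passes is ordinary regex alternation. The sketch below assumes the pattern the interpreter ends up with contains a user alternative next to admin; the exact bytes inside the real literal depend on which control characters were used, and both method names are hypothetical.

```ruby
# What a reviewer believes the method does.
def admin_intended?(roles)
  !!(roles =~ /admin/)
end

# What the method does once "|user" is part of the pattern (assumed form).
def admin_as_parsed?(roles)
  !!(roles =~ /admin|user/)
end

roles = 'user,manager'
puts admin_intended?(roles)  # false
puts admin_as_parsed?(roles) # true
```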
With more effort we might be able to reduce the extra characters in the comment. Also, in Ruby you can choose your regex delimiters if you want. These are all equivalent:
- /admin/
- %r[admin]
- %r!admin!
The ability to choose your delimiter might help you pick a character that looks more believable when it appears in the comment.
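A quick way to confirm that the three literal forms produce the same pattern is to compare them directly. This is a small sketch; the helper name regex_variants is made up for the example.

```ruby
# Hypothetical helper returning the same pattern written with three delimiters.
def regex_variants
  [/admin/, %r[admin], %r!admin!]
end

variants = regex_variants
puts variants.map(&:source).inspect             # ["admin", "admin", "admin"]
puts variants.all? { |re| re == variants.first } # true
```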
Stretched List
Another thing we can stretch is a list of strings. In Ruby that is defined as:
%w[one two three four five six]
This is just a syntactical upgrade to:
['one', 'two', 'three', 'four', 'five', 'six']
As with regex we can choose our delimiter, so this is also the same:
%w!one two three four five six!
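The equivalence of these three spellings can be checked directly. This is a minimal sketch; the helper name word_lists is hypothetical.

```ruby
# Hypothetical helper returning the same list written three ways.
def word_lists
  [
    %w[one two three four five six],
    ['one', 'two', 'three', 'four', 'five', 'six'],
    %w!one two three four five six!
  ]
end

puts word_lists.uniq.size # 1, all three arrays are equal
```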
Now we can inject into our list with Bidi:
role = 'User'
privileged = %w!Admin Manager User! # ! # Don't include
if privileged.include? role
  puts 'admin!'
else
  puts 'regular user :('
end
Here I am using the ability to choose my delimiter and picking ! to make the comment more believable. ! is not normally used as a delimiter, so I could also have just made my comment say # Don't include User] # and someone might think the ] was just an extra character.
Stretched Identifiers
My final variation is to stretch an identifier. In Ruby an identifier can be made of Unicode characters. For example:
😡 = 'Some error message'
STDERR.puts 😡
So let's put some of our Bidi control characters in our variable name:
role= 'Admin' # # Condition will ensure 'User' ! = 'User'
if role == 'Admin'
  puts 'admin!'
else
  puts 'regular user :('
end
There might be other things you can do with this besides assignment.
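The underlying mechanism is that an identifier with a trailing control character is a different variable from the plain one, even though both render alike. The sketch below makes that visible by spelling the character as an escape and evaluating the source with eval. It assumes U+2066 (LEFT-TO-RIGHT ISOLATE) as a stand-in; the control characters in the example above may differ, and the method name shadow_demo is hypothetical.

```ruby
# Assumption: U+2066 stands in for whichever Bidi control is actually used.
def shadow_demo
  source = <<~RUBY
    role\u2066 = 'Admin'
    role = 'User'
    [role\u2066, role]
  RUBY
  eval(source) # two distinct local variables, returned in order
end

p shadow_demo                  # ["Admin", "User"]
p "role\u2066".to_sym == :role # false, the names are not the same symbol
```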
Ok
@security-curious This is absolutely fantastic!
These are all great points. In the Trojan Source paper, we focused on constructs that we knew to be present across many languages, which ultimately resolved to comments and string literals. Regex literals are a clever extension of this in the languages that support them. Although this is not relevant to all major languages, it is relevant to some, such as Ruby and (I suspect) JavaScript.
The stretched identifiers description is the most interesting to me, however. I'm shocked that Ruby allows control characters in identifier names...I suspect there's all sorts of adversarial things you can do with this. I like the stretched identifier example above quite a lot, and for those following along here's a visualization of the underlying encoding in @security-curious's example:
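A rough text version of this kind of visualization can be produced with a short script that replaces each invisible control character with a readable tag. This is only a sketch: the method names are hypothetical, the table covers the common Bidi controls plus ZWSP (which is invisible but not a Bidi control), and the sample line is made up rather than taken from the example.

```ruby
# Abbreviations for the Bidi control characters, plus ZWSP.
def control_names
  {
    0x202A => 'LRE', 0x202B => 'RLE', 0x202C => 'PDF',
    0x202D => 'LRO', 0x202E => 'RLO',
    0x2066 => 'LRI', 0x2067 => 'RLI', 0x2068 => 'FSI', 0x2069 => 'PDI',
    0x200E => 'LRM', 0x200F => 'RLM', 0x200B => 'ZWSP'
  }
end

# Replace each known control character in the line with a <NAME> tag.
def reveal_controls(line)
  line.each_char.map { |ch|
    name = control_names[ch.ord]
    name ? "<#{name}>" : ch
  }.join
end

# Hypothetical sample line, not the encoding of the example above.
puts reveal_controls("role\u2066 = 'Admin'") # role<LRI> = 'Admin'
```

Running reveal_controls over each line of a file (for example with File.foreach) gives the logical order the interpreter sees.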
I'm entirely open to adding a Ruby/ directory in this repo containing relevant examples. @security-curious please feel free to make a PR with any examples that you would like, ideally following the format used for examples in other languages as closely as possible.
If identifiers are what interest you, keep in mind that I'm not just talking about variables: modules, classes, constants, methods, etc. all qualify. The below is a valid Ruby program:
module A📦
  class B🎓
    C💎 = 3.14
    def 🔴 r
      C💎 * r**2
    end
  end
end
puts A📦::B🎓.new.🔴 8
Constants must start with an uppercase letter, hence the C before the 💎. Classes and modules in Ruby are just constants pointing to an instance of a module or class. So:
class A
end
Is the same as:
A = Class.new
This is the reason I needed to prefix the classes and modules with an uppercase letter, while the remaining characters can be any Unicode value. Methods and variables don't have that restriction. So while my example was about assigning a variable, you might be able to do other trickery with method, class, constant and module names. Really any identifier.
I did reach out to the Ruby team regarding all this, and they felt addressing this at the interpreter level was not the right solution. I guess there is a debate regarding "defense in depth" vs. the maintenance cost of playing whack-a-mole with odd Unicode characters. I can see both sides of the argument.
Going to close this as well since I just wanted to bring up the alternatives to see if they are of interest. I might later add the Ruby PR as you suggested. In it I can include all the strategies your paper covered as applicable to Ruby as well as maybe some of these variations.