brianmario/yajl-ruby

Yajl can't parse strings with \v (Vertical Tab)

Hello,

Yajl is complaining that it can't parse a feed I have, which contains a \v (vertical tab, hex code 0x0B).

Yajl::Parser.parse("{\"string\": \"This is a\vtest\"}")

Yajl::ParseError: lexical error: invalid character inside string.
                   {"string": "This is a
                                        test"}
                     (right here) ------^

    from /Users/jmonteiro/.rvm/gems/ree-1.8.7-2011.03@jobscore/gems/yajl-ruby-0.8.0/lib/yajl.rb:37:in `parse'
    from /Users/jmonteiro/.rvm/gems/ree-1.8.7-2011.03@jobscore/gems/yajl-ruby-0.8.0/lib/yajl.rb:37:in `parse'
    from (irb):19

After using gsub(/[\s\b\v]+/, " ") to clean out \b (backspace) and \v, it is working as expected.

Yajl::Parser.parse("{\"string\": \"This is a\vtest\"}".gsub(/[\s\b\v]+/, " "))

{"string"=>"This is a test"} 

I'm pretty sure that character must be escaped as \u000b in the JSON string. According to the RFC, %x20-21 / %x23-5B / %x5D-10FFFF are the only characters that can remain unescaped, and \v is 0x0B, which falls outside those ranges.

Yajl.load "[\"\\u000b\"]"
 => ["\v"]
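If the feed can't be fixed at the source, one option is to escape the offending characters rather than replace them with spaces, so the original data survives parsing. Below is a minimal sketch of that idea, not anything yajl-ruby provides: json_unescaped? and escape_control_chars are hypothetical helper names, and the example uses Ruby's standard json library for the parse step so it runs without extra gems. It assumes the stray control characters only ever occur inside string values, and it leaves tab, LF and CR alone because those are legal whitespace between JSON tokens.

```ruby
require "json"

# Hypothetical helper: true when a code point may appear unescaped inside a
# JSON string, per the ranges quoted above (%x20-21 / %x23-5B / %x5D-10FFFF).
def json_unescaped?(codepoint)
  (0x20..0x21).cover?(codepoint) ||
    (0x23..0x5B).cover?(codepoint) ||
    (0x5D..0x10FFFF).cover?(codepoint)
end

# Hypothetical helper: rewrite raw control characters as \u00XX escapes.
# Tab (0x09), LF (0x0A) and CR (0x0D) are skipped on purpose, since they are
# valid whitespace between tokens. Assumes the remaining control characters
# only occur inside string values.
def escape_control_chars(raw_json)
  raw_json.gsub(/[\x00-\x08\x0B\x0C\x0E-\x1F]/) do |ch|
    format("\\u%04x", ch.ord)
  end
end

raw = "{\"string\": \"This is a\vtest\"}"
cleaned = escape_control_chars(raw)

# The vertical tab is now spelled \u000b in the document text, and the parsed
# value still contains the real character instead of a space.
parsed = JSON.parse(cleaned)
puts json_unescaped?("\v".ord)
puts cleaned
puts parsed["string"].include?("\v")
```

The cleaned string should also be acceptable to Yajl::Parser.parse, since it is the same \u000b form shown in the Yajl.load example, though that has not been verified against yajl-ruby 0.8.0 here.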