logstash-plugins/logstash-filter-json

Plugin crash due to invalid UTF-32 character

ThomasdOtreppe opened this issue · 2 comments

I'm trying to parse eve.json (from Suricata IDS), but Logstash hits a decoding error and the file input plugin keeps crashing and restarting.

I get a lot of the following (basically 'Error: Invalid UTF-32 character 0x2274696d(above 10ffff)'):

{:timestamp=>"2015-09-28T10:07:16.072000-0600", :message=>"A plugin had an unrecoverable error. Will restart this plugin.\n  Plugin: <LogStash::Inputs::File path=>[\"/var/log/suricata/eve.json\"], sincedb_path=>\"/var/logstash/suricata.db\", sincedb_write_interval=>1, codec=><LogStash::Codecs::JSON charset=>\"UTF-8\">, type=>\"suricata\", start_position=>\"beginning\", tags=>[\"eve\"], debug=>false, stat_interval=>1, discover_interval=>15, delimiter=>\"\\n\">\n  Error: Invalid UTF-32 character 0x2274696d(above 10ffff)  at char #527, byte #2111)", :level=>:error}
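As a quick check on the number in that message, the reported value 0x2274696d can be split into its four bytes. This is only a sketch of the arithmetic, not a confirmed diagnosis: the bytes happen to spell the ASCII text `"tim`, and byte #2111 at char #527 works out to roughly four bytes per character, which may suggest the parser was reading plain ASCII JSON as UTF-32 rather than choking on a `\u` escape.

```python
# Split the code point reported in the error message into its four bytes.
value = 0x2274696D
raw = value.to_bytes(4, "big")

# The four bytes are all printable ASCII.
as_ascii = raw.decode("ascii")

# Unicode code points stop at 0x10FFFF, hence the "above 10ffff" complaint.
exceeds_unicode = value > 0x10FFFF

# Offsets quoted in the error: char #527 at byte #2111.
bytes_per_char = round(2111 / 527)

print(repr(as_ascii), exceeds_unicode, bytes_per_char)
```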

It would be nice if it either skipped the line or ignored the character.

Is there any way to know on which line it is crashing? The file is huge and I cannot check every single line manually.
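One way to narrow this down outside of Logstash is a small standalone script that walks the file and reports the line numbers that do not decode or parse. The sketch below is an assumption about what might be wrong (stray NUL bytes, invalid UTF-8, or malformed JSON) and the helper name `find_bad_lines` is made up for illustration; it does not reproduce the exact parser Logstash uses, so a line that passes here could still fail there.

```python
import json
import sys


def find_bad_lines(path):
    """Yield (line_number, reason) for each line that fails a basic check."""
    # Read as bytes so that encoding problems are visible instead of hidden.
    with open(path, "rb") as handle:
        for number, raw in enumerate(handle, start=1):
            line = raw.rstrip(b"\r\n")
            if not line:
                continue  # skip blank lines
            if b"\x00" in line:
                # NUL bytes can make a parser guess a UTF-16/UTF-32 encoding.
                yield number, "contains NUL byte(s)"
                continue
            try:
                json.loads(line.decode("utf-8"))
            except UnicodeDecodeError as exc:
                yield number, f"not valid UTF-8: {exc}"
            except json.JSONDecodeError as exc:
                yield number, f"not valid JSON: {exc}"


# Only runs when a path is given on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    for number, reason in find_bad_lines(sys.argv[1]):
        print(f"line {number}: {reason}")
```

Saved under a hypothetical name such as `check_eve.py`, it could be run as `python check_eve.py /var/log/suricata/eve.json`. It reads one line at a time, so a huge file should not be a memory problem.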

It might be one of those lines but I'm not entirely sure:

{"timestamp":"2015-07-30T09:17:17.021800","event_type":"fileinfo","src_ip":"208.71.44.30","src_port":80,"dest_ip":"192.168.3.18","dest_port":62021,"proto":"TCP","http":{"url":"\/c?s=782201015&t=tSRum39Fl35A8Ge4,0.7374701946973801&_I=&_AO=0&_NOL=0&_R=http:\/\/football.fantasysports.yahoo.com\/f1\/398572\/starters&_K=3.18.3\u0005_pl\u00031\u0004A_v\u00033.18.3\u0004_bt\u0003rapid\u0004A_sid\u0003ys91JmOhY7AESVDH\u0004_w\u0003football.fantasysports.yahoo.com\/\u0004pd\u0003home\u0004chan\u0003\u0004_ts\u00031438269411&_C=sec\u0003yfa-renew-league\u0004slk\u0003Renew\u0004_p\u00030","hostname":"geo.yahoo.com","http_refer":"http:\/\/football.fantasysports.yahoo.com\/","http_user_agent":"Mozilla\/5.0 (Windows NT 6.3; WOW64) AppleWebKit\/537.36 (KHTML, like Gecko) Chrome\/44.0.2403.125 Safari\/537.36"},"fileinfo":{"filename":"\/c","state":"CLOSED","stored":false,"size":43}}

Where it's trying to interpret \u00031438269411

Or here is another one:

{"timestamp":"2015-07-30T09:18:02.116291","event_type":"fileinfo","src_ip":"208.71.44.30","src_port":80,"dest_ip":"192.168.3.18","dest_port":62028,"proto":"TCP","http":{"url":"\/p?s=782205232&t=etdgDqzLWbdlk9lB,0.028852603863924742&_I=&_AO=0&_NOL=0&_R=http:\/\/football.fantasysports.yahoo.com\/&_P=3.18.3\u0005_pl\u00031\u0004A_v\u00033.18.3\u0004_bt\u0003rapid\u0004A_sid\u0003Smz72cxjxSP5mTvZ\u0004_w\u0003football.fantasysports.yahoo.com\/f1\/reg\/renewleague?renew=331_873236\u0004site\u0003sports\u0004ver\u0003uh3s\u0004psp\u0003standard\u0004lang\u0003en-us\u0004uh_test\u0003acctswitch,acctswitch\u0004_ts\u00031438269436\u0004t1\u0003a1\u0004t2\u0003uh-d\u0004t4\u0003usr-mu\u0004t3\u0003tl-lst\u0004t5\u0003acct-num-1\u0004slk\u0003acct-ct\u0004_E\u0003secvw","hostname":"geo.yahoo.com","http_refer":"http:\/\/football.fantasysports.yahoo.com\/f1\/reg\/renewleague?renew=331_873236","http_user_agent":"Mozilla\/5.0 (Windows NT 6.3; WOW64) AppleWebKit\/537.36 (KHTML, like Gecko) Chrome\/44.0.2403.125 Safari\/537.36"},"fileinfo":{"filename":"\/p","state":"CLOSED","stored":false,"size":43}}

Where it thinks the value to interpret is \u00031438269436
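For what it's worth, a JSON `\u` escape always takes exactly four hex digits, so a spec-compliant parser should read `\u00031438269436` as the control character U+0003 followed by the ordinary digits `1438269436`, not as one huge code point. The snippet below checks this with Python's standard `json` module on a shortened fragment of the second sample line; it shows how the escape is defined, not how the parser inside Logstash actually behaved here.

```python
import json

# Shortened fragment of the URL field from the sample line above.
fragment = '"_ts\\u00031438269436\\u0004t1"'
decoded = json.loads(fragment)

# The escape consumes only "0003"; the remaining digits are plain text.
control_char = decoded[3]
digits = decoded[4:14]

print(hex(ord(control_char)), digits)
```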

I'm seeing the same error in logstash.log, but output to the Redis server still seems to be working fine.