[Bug][Classifier] TypeError: unhashable type: 'dict' when processing raw event encapsulated in a string
chunyong-lin opened this issue · 1 comments
chunyong-lin commented
Background
PR #1077 surfaced a bug in our parser: StreamAlert throws TypeError: unhashable type: 'dict' when parsing events for the TrendMicro schema, because those events have an unusual shape.
The root cause is that TrendMicro events are a list of dicts encapsulated in a string. The parser for this type of event is json, with json_path configured in the schema conf file. We hit the bug when TrendMicro events go to the same data source as other events whose schemas do not require json_path. As the traceback further down shows, the key check then calls set() on the whole record, which is a list of dicts at that point.
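To make the event shape concrete, here is a minimal sketch using only the standard library. The record string is the one from the unit test in this issue, and the one-key schema is assumed to be the same as in that test. It only illustrates what the decoded value looks like and is not StreamAlert code.

```python
import json

# Record body taken from the unit test in this issue: a JSON array of
# objects, delivered as a string.
record_data = "[{\"key\": \"value\"}]"

decoded = json.loads(record_data)

# The decoded value is a list whose items are dicts, not a dict.
print(type(decoded).__name__)     # list
print(type(decoded[0]).__name__)  # dict

# A schema without json_path describes the keys of a top-level dict,
# so the list itself has no keys to compare against it.
schema = {'key': 'string'}
print(isinstance(decoded, dict))       # False
print(set(decoded[0]) == set(schema))  # True: only the inner dict matches
```

So the inner dict would match the schema, but the top-level value handed to a schema without json_path is the list.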
TL;DR: the issue can be reproduced in two ways.
- Add a unit test:
    def test_parse_record_not_dict_mismatch(self):
        """JSONParser - Parse record not in dict type and doesn't match schema"""
        options = {
            'schema': {
                'key': 'string'
            },
            'parser': 'json'
        }
        # The record is a list of dicts encapsulated in a string
        record_data = "[{\"key\": \"value\"}]"
        parser = JSONParser(options)
        assert_equal(parser.parse(record_data), False)
- Verify in the Python interpreter:
    $ python
    >>> set({'a': 1})
    {'a'}
    >>> set([{'a': 1}])
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: unhashable type: 'dict'
    >>>
The full traceback is similar to:
Traceback (most recent call last):
  File "manage.py", line 116, in <module>
    main()
  File "manage.py", line 112, in main
    sys.exit(not cli_runner(options))
  File "/Users/SECRET_DIR_PATH_BALABALA/streamalert_cli/runner.py", line 70, in cli_runner
    result = cmds[args.command](args)
  File "/Users/SECRET_DIR_PATH_BALABALA/streamalert_cli/runner.py", line 126, in <lambda>
    command: lambda opts, cmd=cli_command: cmd.handler(opts, config)
  File "/Users/SECRET_DIR_PATH_BALABALA/streamalert_cli/test/handler.py", line 198, in handler
    result = result and TestRunner(options, config).run()
  File "/Users/SECRET_DIR_PATH_BALABALA/streamalert_cli/test/handler.py", line 372, in run
    classifier_result = self._run_classification(event)
  File "/Users/SECRET_DIR_PATH_BALABALA/streamalert_cli/test/handler.py", line 251, in _run_classification
    return _classifier.run(records=[record])
  File "/Users/SECRET_DIR_PATH_BALABALA/streamalert/classifier/classifier.py", line 250, in run
    self._classify_payload(payload)
  File "/Users/SECRET_DIR_PATH_BALABALA/streamalert/classifier/classifier.py", line 170, in _classify_payload
    self._process_log_schemas(record, logs_config)
  File "/Users/SECRET_DIR_PATH_BALABALA/streamalert/classifier/classifier.py", line 135, in _process_log_schemas
    parsed = parser.parse(payload_record.data)
  File "/Users/SECRET_DIR_PATH_BALABALA/streamalert/classifier/parsers.py", line 490, in parse
    valid = valid and self._key_check(record, self._schema, self._optional_top_level_keys)
  File "/Users/SECRET_DIR_PATH_BALABALA/streamalert/classifier/parsers.py", line 246, in _key_check
    keys = set(record) if not optionals else set(record).union(optionals)
TypeError: unhashable type: 'dict'
Steps to Reproduce
See the background description.
Desired Change
Handle the case where the record is a list of dicts, so the parser reports a schema mismatch instead of raising.
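One possible shape for such a guard, as a rough sketch only: `key_check` below is a hypothetical standalone function, not the real `_key_check` method. The assumption that keys are compared by set equality against the schema is mine, and this is not necessarily how the eventual fix works. The idea is to reject any non-dict record before calling set() on it.

```python
import json


def key_check(record, schema, optionals=None):
    """Return True only if record is a dict whose keys match the schema.

    Hypothetical stand-in for illustration; not the StreamAlert code.
    """
    if not isinstance(record, dict):
        # A list of dicts (or any other non-dict value) cannot match a
        # dict schema, so report a mismatch rather than raising.
        return False
    # Same expression as the failing line in the traceback, now only
    # reached when record is a dict.
    keys = set(record) if not optionals else set(record).union(optionals)
    # Assumption: a match means the key sets are equal.
    return keys == set(schema)


schema = {'key': 'string'}
print(key_check(json.loads('{"key": "value"}'), schema))    # True
print(key_check(json.loads('[{"key": "value"}]'), schema))  # False
```

With a guard like this, the unit test above would see a False result for the list-of-dicts record rather than a TypeError.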
chunyong-lin commented
Fixed in PR #1085