Evtx.BinaryParser.ParseException raised unexpectedly
john-corcoran opened this issue · 7 comments
Hi,
I'm seeing exception Evtx.BinaryParser.ParseException raised unexpectedly when parsing event log "Microsoft-Windows-Windows Firewall With Advanced Security%4Firewall.evtx" (extracted from a Windows Server 2012 testing instance) using evtx_dump.py.
Same behaviour seen when parsing on Windows with Python 2.7, and Ubuntu with Python 2.7 / 3.5.
Traceback from Windows with Python 2.7 as follows:
Traceback (most recent call last):
File "evtx_dump.py", line 42, in <module>
main()
File "evtx_dump.py", line 37, in main
print(record.xml())
File "C:\Python27\lib\site-packages\Evtx\Evtx.py", line 481, in xml
return e_views.evtx_record_xml_view(self)
File "C:\Python27\lib\site-packages\Evtx\Views.py", line 204, in evtx_record_xml_view
return render_root_node(record.root())
File "C:\Python27\lib\site-packages\Evtx\Views.py", line 182, in render_root_node
for sub in root_node.substitutions():
File "C:\Python27\lib\site-packages\Evtx\BinaryParser.py", line 64, in __call__
cache[key] = self.func(*args, **kw)
File "C:\Python27\lib\site-packages\Evtx\Nodes.py", line 1019, in substitutions
raise ParseException("Invalid substitution value size")
Evtx.BinaryParser.ParseException: Parse Exception(Invalid substitution value size)
I can forward the EVTX file in question if that assists?
hey @john-corcoran
Thanks for raising this issue. The easiest way to triage it will be to share the .evtx file. I'd be happy to take a peek if you send to my email: willi.ballenthin@gmail.com
@john-corcoran just wanted to let you know i still have my eye on this issue. been doing some traveling so haven't had as much time as i'd have liked to triage the bug. the evtx files you provided should be enough for me to figure things out in the next few days.
I suspect this evtx file is corrupt. For example, when I simply print the record numbers, we get output like:
[...]
180
181
182
183
184
185
186
187
188
7999193089
[exception traceback]
clearly, 7999193089
is not a valid record number. I think expected behavior would be to identify this corruption earlier, and then notify the user that something is wrong (rather than trying to interpret invalid data).
evtx_info.py
also fails:
λ C:\Users\user\env2\Scripts\python ..\scripts\evtx_info.py data\issue_45.evtx
Information from file header:
Format version : 3.1
Flags : 0x00000000
File is : clean
Log is full : no
Current chunk : 2 of 3
Oldest chunk : 1
Next record# : 189
Check sum : pass
Information from chunks:
Chunk file (first/last) log (first/last) Header Data
- ----- --------------------- --------------------- ------ ------
> 1 1 66 1 66 pass pass
2 67 123 67 123 pass pass
* 3 124 188 124 188 pass pass
Traceback (most recent call last):
File "..\scripts\evtx_info.py", line 109, in <module>
main()
File "..\scripts\evtx_info.py", line 82, in main
if not chunk.check_magic():
File "c:\users\user\documents\code\python-evtx\Evtx\Evtx.py", line 308, in check_magic
return self.magic() == "ElfChnk\x00"
File "c:\users\user\documents\code\python-evtx\Evtx\BinaryParser.py", line 211, in explicit_length_handler
return f(offset, length)
File "c:\users\user\documents\code\python-evtx\Evtx\BinaryParser.py", line 475, in unpack_string
return self.unpack_binary(offset, length).decode('ascii')
UnicodeDecodeError: 'ascii' codec can't decode byte 0xa1 in position 2: ordinal not in range(128)
seems like this file contains junk data after 3 valid chunks:
λ C:\Users\user\env2\Scripts\python ..\scripts\evtx_info.py data\issue_45.evtx
Information from file header:
Format version : 3.1
Flags : 0x00000000
File is : clean
Log is full : no
Current chunk : 2 of 3
Oldest chunk : 1
Next record# : 189
Check sum : pass
Information from chunks:
Chunk file (first/last) log (first/last) Header Data
- ----- --------------------- --------------------- ------ ------
> 1 1 66 1 66 pass pass
2 67 123 67 123 pass pass
* 3 124 188 124 188 pass pass
4 [INVALID]
5 [INVALID]
6 [INVALID]
7 [INVALID]
8 [INVALID]
9 [INVALID]
10 [INVALID]
11 [INVALID]
12 [INVALID]
13 [INVALID]
14 [INVALID]
15 [INVALID]
16 [INVALID]
commits 5d2754f and 1ed29cf improve the robustness of the code that parses chunks.
i think the issue is that when enumerating chunks (and therefore records) via the Evtx
log object, the log object doesn't take into account how many chunks are declared to be in the log file. Instead, it keeps reading until the end of the file. So, if there's junk data appended to the file, it attempts to parse this as valid data. See here:
Line 92 in 5fb7662
confirmed the theory above, and addressed the issue in df8edf4.
thanks to @john-corcoran for reporting this issue!