Implement support for compressed long values
vicodark opened this issue · 5 comments
I tried running eseparser to dump SystemIndex_PropertyStore from a few Windows.edb files. Every time, most of the string data comes out as Chinese characters or similar, as seen here:
{"WorkID":40,"27F-System_Search_Rank":707406378,"14F-System_FileAttributes":707406378,"4440-System_ItemFolderPathDisplay":"尕绮諚檵夭ᖬ둮처錶炵岵圌엃굠ౙ鬖拍�ᖭ솃絢献淊잦淼뮦ﲱ缽ﲹ�碼왆얉㋷쬏懩࠷麝�㳲","
I tried digging around in the code, and it looks like the taggedItems buffers returned by ParseTaggedValues for these Long Text columns do not hold the string data at all. A random selection of the data stored there, hex-encoded, looks like this:
10fb692bd6aab564b156ac96bbd16232db4cd6c2d572315c0d1783b56631586c368bd966b7560c068bf501
Any idea what's going on here?
Well, you can just go ahead and ignore the above and have a good laugh at the fact that I forgot that Windows.edb can have compressed strings.
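For anyone hitting the same symptom, here is a minimal Python sketch of why the output looks Chinese. It assumes the parser decodes the raw column bytes as UTF-16LE without checking for compression; the function name `naive_utf16` is hypothetical and not part of eseparser. Run against the hex sample above, none of the resulting characters are ASCII, which is what you would expect from compressed bytes read as text.

```python
# Hex sample from this issue: raw bytes of one Long Text tagged value.
SAMPLE_HEX = "10fb692bd6aab564b156ac96bbd16232db4cd6c2d572315c0d1783b56631586c368bd966b7560c068bf501"


def naive_utf16(raw: bytes) -> str:
    """Decode bytes as UTF-16LE, the way a compression-unaware parser might (assumption)."""
    # errors="replace" turns the odd trailing byte into U+FFFD instead of raising.
    return raw.decode("utf-16-le", errors="replace")


text = naive_utf16(bytes.fromhex(SAMPLE_HEX))
ascii_count = sum(1 for ch in text if ord(ch) < 0x80)
print(f"{len(text)} characters decoded, {ascii_count} of them ASCII")
```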
Is there a way we can automatically figure out it is compressed and decompress it?
esedbexport in libesedb does it. I didn't know till today that esedbexport does some artifact-specific processing in the tool itself for SRUM, Windows.edb, and others. For Windows.edb, it appears many of the strings are compressed by one of a few compression algos and also obfuscated with some simple bitbashing stuff, all of which esedbexport knows how to decode.
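As a rough illustration of how automatic detection might work, here is a Python sketch for the 7-bit case only. It assumes that the top five bits of the first byte identify the compression scheme (1 for 7-bit ASCII, 2 for 7-bit Unicode, 3 for XPRESS) and that the low three bits hold the number of bits used in the final byte, minus one. Those constants and every function name below are assumptions for illustration, not the eseparser or libesedb API, and the Windows.edb-specific obfuscation is not covered. The hex sample above is at least consistent with this layout: its first byte is 0x10 (identifier 2), and it unpacks to a brace-wrapped SID-style string.

```python
from typing import Optional

# Hex sample from this issue: raw bytes of one Long Text tagged value.
SAMPLE_HEX = "10fb692bd6aab564b156ac96bbd16232db4cd6c2d572315c0d1783b56631586c368bd966b7560c068bf501"

# Assumed compression identifiers (top five bits of the first byte).
SEVEN_BIT_ASCII = 0x1
SEVEN_BIT_UNICODE = 0x2
XPRESS = 0x3


def compression_id(raw: bytes) -> int:
    """Return the assumed compression identifier, or 0 for an empty buffer."""
    return raw[0] >> 3 if raw else 0


def decode_7bit(raw: bytes) -> str:
    """Unpack 7-bit characters stored least-significant-bit first after the header byte."""
    if len(raw) < 2:
        raise ValueError("buffer too short for a 7-bit compressed value")
    payload = raw[1:]
    bits_in_last_byte = (raw[0] & 0x07) + 1  # assumed meaning of the low three bits
    total_bits = (len(payload) - 1) * 8 + bits_in_last_byte
    stream = int.from_bytes(payload, "little")  # bit 0 is the first bit of the stream
    return "".join(chr((stream >> (7 * i)) & 0x7F) for i in range(total_bits // 7))


def maybe_decompress(raw: bytes) -> Optional[str]:
    """Decode 7-bit values; return None for schemes this sketch does not handle."""
    if compression_id(raw) in (SEVEN_BIT_ASCII, SEVEN_BIT_UNICODE):
        return decode_7bit(raw)
    return None


decoded = maybe_decompress(bytes.fromhex(SAMPLE_HEX))
print(decoded)
```

A real implementation would also need the XPRESS path and a way to tell an uncompressed value apart from one whose first byte merely looks like a header, for example by consulting the column flags rather than the data alone.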
OK, let's take a look at what esedbexport does and match the specific artifact processing if possible.
If you can share a sample file (even privately), we can implement support for compressed values. We have the source code for ESE released by Microsoft, so it is much easier to figure out.