Walkthrough for restoring OEP and IAT for dumped executables?
TAbdiukov opened this issue · 11 comments
Hello,
I'm trying to dump a packed executable and, among other things, the dump ends up with the OEP set to 0x00000000 and a broken IAT. I currently do the following:
- Close all apps
pd -db genquick
- Run my target
pd -pid <pid>
The dumper does the best it can, sure, but is there a way to restore the OEP (so I can run the executable) and the IAT (so it runs anywhere other than the VM)? Thanks heaps <3
One suggestion I came up with inspired by https://reverseengineering.stackexchange.com/a/11272
Since the dump stores the IAT that was present at runtime, I can either find the string representation of the imports in the dump (if present, which is always the case for me) or listen to the program's API calls. Either way, I don't see how to translate the API names to their static addresses. Any help will be appreciated.
Any update to this issue?
@cglmrfreeman also interested
The current behavior varies with the type of resource being dumped.
There are two main memory dump scenarios supported by ProcessDump:
- In-memory PE files
- Loose executable regions in memory that have no PE header
Given that you observed an OEP of 0x00000000, I suspect ProcessDump may be dumping an in-memory PE file where the packer has deliberately wiped the entry point.
How it handles these two is a bit different:
OEP and dumping of in-memory PE files
If an in-memory PE module is being dumped, ProcessDump keeps the original OEP specified in the in-memory PE header. This is usually right, but for packed files it may no longer be the correct entry point, and in some cases the packer deliberately wipes it.
Reconstructing this can be challenging guesswork, depending on what the executable code looks like. Unwinding the threads might work sometimes, but only if the entry thread is still active. Otherwise I can't think of a great way to determine it without manual research per file, but I am open to suggestions.
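To check what entry point a dumped PE actually carries, the field can be read straight from the header. This is a minimal sketch based on the standard PE layout, not ProcessDump's own code; the function name `read_entry_point_rva` is hypothetical.

```python
import struct

def read_entry_point_rva(image: bytes) -> int:
    """Return AddressOfEntryPoint (an RVA) from a PE image's optional header."""
    if image[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew (file offset of the "PE\0\0" signature) is stored at 0x3C.
    (e_lfanew,) = struct.unpack_from("<I", image, 0x3C)
    if image[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # Signature (4 bytes) + COFF file header (20 bytes), then 16 bytes into
    # the optional header. The offset is the same for PE32 and PE32+.
    (entry_rva,) = struct.unpack_from("<I", image, e_lfanew + 4 + 20 + 16)
    return entry_rva
```

A result of 0 here corresponds to the wiped or missing entry point described in this thread.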
OEP and dumping of loose executable regions in memory
With a loose executable region in memory, ProcessDump creates its own PE header and import table so that the region can be analyzed. Unfortunately there isn't a great way to reconstruct where the original entry point into it might have been, so it just sets the entry point to the very start of the region - which is sometimes right.
IAT reconstruction
The IAT reconstruction (or construction, in the case of loose executable regions) method in ProcessDump is a more aggressive approach than the one in the referenced Stack Overflow answer. I've found it to be quite successful. If you have an example malware file, maybe share its SHA-256 and I might be able to have a look at why it didn't build the IAT correctly?
Here's how it works:
- At dump time it looks at all loaded modules in the process, enumerating all the export addresses in the process. Call these ExportAddresses[].
- When dumping a code region or PE file from memory, it:
- Finds all possible references everywhere in the dumped module to these ExportAddresses[] by a raw search for any dword/qword matches. All of these matches throughout the whole process are then used to create the new IAT.
- Increases the size of the last section in the PE file being dumped.
- Creates a new IAT that links to all the scattered dword/qwords to link them to the respective Library+ExportNames.
The advantage of this is that it not only corrects the IAT, but also fixes up any reference to a library export anywhere in the code. For example, if they had custom code which resolves library addresses and saves them to global variables, like:
- void* MyShellExecute = GetProcAddress(LoadLibraryA("shell32.dll"), "ShellExecuteA");
it will automatically link these up for you within the memory dump as well. Additionally, this works even when dumping loose executable regions which have no IAT.
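The raw search step described above can be sketched roughly as follows. This is only an illustration of the idea, not the actual ProcessDump implementation: the function name, the dictionary layout, and the choice to test every byte offset (rather than only aligned ones) are assumptions.

```python
import struct

def find_export_references(dump: bytes, export_addresses: dict, pointer_size: int = 8):
    """Find every pointer-sized value in `dump` that equals a known export address.

    `export_addresses` maps an absolute address to a "library!export" name
    (hypothetical layout). Returns a list of (offset, name) pairs.
    """
    fmt = "<Q" if pointer_size == 8 else "<I"  # qword for 64-bit, dword for 32-bit
    hits = []
    # Assumption: scan every byte offset, since the references may be unaligned.
    for offset in range(0, len(dump) - pointer_size + 1):
        (value,) = struct.unpack_from(fmt, dump, offset)
        name = export_addresses.get(value)
        if name is not None:
            hits.append((offset, name))
    return hits
```

Each hit is a location that a rebuilt IAT would need to link to the named export, whether it sits in the original import table or in a global variable filled in by GetProcAddress.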
RE: is there a way to restore the OEP (so I can run the executable) and IAT (run anywhere else aside from the VM)
Generally, a memory dump of a running process will not run successfully. Consider an example like this:
```
global void* myFileHandle;

function myFunction()
{
    if(myFileHandle == null)
    {
        myFileHandle = ... create a real handle to a file mapping or something ..
    }
    ... now use myFileHandle ...
}
```
In the memory dump, the value of myFileHandle is saved from the running process. When you run the executable again, the new process treats it as a valid handle when in fact it no longer is. There are a lot of challenges like this, which mean that most of the time you can't cleanly re-execute a component dumped from memory. It is great for static analysis though :)
Hope this helps! I'm open to suggestions, and especially pull requests as well.
Here's how I see it, and feel free to correct me where I'm wrong, because I most certainly might be:
The program takes a packed executable, and correctly creates an unpacked executable. Alright, so we have both variables. The question is, at what instruction did we first enter unpacked territory? Is there no way for the program to take what it had, what it now has, and run a secondary scan indicating where the first instructions run from an unpacked executable in memory?
There are methods to reconstruct the IAT once you know where the OEP is. However, I have been very unsuccessful in finding the OEP of the unpacked code for a particular EXE, even though Process-Dump seems to have no issue creating an unpacked EXE from it. If Process-Dump could somehow give a hint about the OEP to OllyDbg users (or users of debuggers of that type), manual reconstruction would become much easier.
Based on the current version of Process-Dump, I can't seem to gather any information about how my particular EXE is unpacked or at what instruction Process-Dump decided to dump the process. I could be very wrong in assuming how this works, but again feel free to explain why I'm wrong.
So one idea that will work for some binaries is creating a set of known entry point pattern signatures - sort of like a FLIRT signature.
Idea:
- An entry-point hash will be computed from a partial hash of the code at the entry point (e.g. using a length disassembler and hashing just the first 50 opcodes).
- A database of entry-point signatures is built at clean-hash building time (and maybe I'll ship one with the release of the tool).
- Now when ProcessDump dumps a module with a clearly invalid OEP, it will search the module being dumped for a match against the database of known entry-point hashes. If successful, it will set the OEP to the first match.
This might be able to reconstruct the OEP in some cases like this, and would be fairly easy to implement.
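A rough sketch of the idea, to make it concrete. It assumes a fixed-length byte window in place of the opcode-based hash (a real version would disassemble and hash opcodes so that operands don't break the match), and all names and the window length are hypothetical.

```python
import hashlib

SIGNATURE_LENGTH = 32  # hypothetical window size, stand-in for "first 50 opcodes"

def entry_signature(code: bytes, offset: int, length: int = SIGNATURE_LENGTH) -> str:
    """Hash the bytes starting at `offset`; a stand-in for an opcode-based hash."""
    return hashlib.sha256(code[offset:offset + length]).hexdigest()

def build_signature_db(clean_samples):
    """Build the database from (code, entry_offset) pairs taken from clean files."""
    return {entry_signature(code, entry) for code, entry in clean_samples}

def find_entry_by_signature(code: bytes, db, length: int = SIGNATURE_LENGTH):
    """Return the first offset whose signature is in the database, else None."""
    for offset in range(len(code) - length + 1):
        if entry_signature(code, offset, length) in db:
            return offset
    return None
```

Hashing at every offset is slow for large modules, which is one reason to put a cheap pre-filter in front of the full hash.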
Added in change f4de059.
Would you be able to have a look? As a note, I found during building this feature that most modules that have an entry point of 0 are actually DLL libraries that don't have an entry point specified (just exported functions). It is likely the malware you are looking at is a DLL and may not have ever had a defined entry point.
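One quick way to check whether a dumped module is a DLL is the IMAGE_FILE_DLL bit in the COFF header's Characteristics field. The snippet below is a small sketch based on the standard PE layout; the function name `is_dll` is hypothetical and this is not part of ProcessDump.

```python
import struct

IMAGE_FILE_DLL = 0x2000  # Characteristics flag defined by the PE/COFF format

def is_dll(image: bytes) -> bool:
    """Return True if the PE image's COFF header has the DLL flag set."""
    if image[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    (e_lfanew,) = struct.unpack_from("<I", image, 0x3C)
    if image[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # Characteristics is the last 2-byte field of the 20-byte COFF header,
    # which follows the 4-byte signature: e_lfanew + 4 + 18.
    (characteristics,) = struct.unpack_from("<H", image, e_lfanew + 4 + 18)
    return bool(characteristics & IMAGE_FILE_DLL)
```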
How to try it out:
- Download and build the latest binary.
- Build the clean hash database. It will now also create two new databases: "entrypoints.hashes" and "shortentrypoints.hashes". Two hashes are stored per entry point: one of the first 8 bytes at the entrypoint, and one of between 30 and 100 opcodes disassembled from the entrypoint.
- Now when you dump from memory, if the OEP is 0 or invalid, it will attempt to reconstruct the entry-point based on these databases of known formats for entrypoints.
To debug it, you can pass "-v" on the command line to log detailed information about whether it tried to reconstruct the entrypoint and what it found. It should show something like:
INFO: Re-building entrypoint. Original entrypoint invalid: 0
INFO: Possible entrypoint found (weak): 1040
INFO: Possible entrypoint found (weak): 1045
INFO: Possible entrypoint found (weak): 1998
...
INFO: Possible entrypoint found (weak): 1ac10
INFO: Possible entrypoint found (strong): 1ac10
INFO: Possible entrypoint found (weak): 1afe0
INFO: Updated entrypoint to: 1ac10
A weak entrypoint means that the first 8 bytes matched a known entrypoint; this triggers a full disassembly to check the strong hash. It will use the first strong hash match as the entrypoint, and if no strong match is found it will use the first weak match.
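The selection rule can be sketched like this. It is a simplified illustration rather than the real code: the weak check compares the raw 8 bytes against a set, and the strong check hashes a fixed 64-byte window instead of 30 to 100 disassembled opcodes. The function name and both database layouts are assumptions.

```python
import hashlib

WEAK_LEN = 8     # the first 8 bytes at a candidate entrypoint
STRONG_LEN = 64  # assumption: fixed byte window standing in for the opcode hash

def choose_entry_point(code: bytes, weak_db: set, strong_db: set):
    """Return (offset, kind): first strong match, else first weak match, else (None, "none")."""
    first_weak = None
    for offset in range(len(code) - WEAK_LEN + 1):
        if code[offset:offset + WEAK_LEN] not in weak_db:
            continue  # cheap pre-filter: most offsets are rejected here
        if first_weak is None:
            first_weak = offset
        # Only weak matches pay for the more expensive strong check.
        strong = hashlib.sha256(code[offset:offset + STRONG_LEN]).digest()
        if strong in strong_db:
            return offset, "strong"
    if first_weak is not None:
        return first_weak, "weak"
    return None, "none"
```

In the log above, this rule is why the entrypoint is updated to 1ac10 (the first strong match) rather than 1040 (the first weak match).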
You can use '-eprec' to get it to force reconstruct the entrypoint of every module it dumps. Really helpful for testing!
I can't seem to get the '-eprec' flag to work? It keeps saying "Failed to parse argument"
Got the flag working, but unfortunately Windows does not like the EP that was assigned, and I get different results each time the unpacked executable with the restored EP is run.
It may just be that this method will be incapable of restoring the EP for this particular application.
EDIT: Exactly what determines how many times the EP should be attempted? It seemed like only 8 entry points were attempted, all weak, and the program decided to use the first one from the list?
Anything at all? Has the '-eprec' ever successfully worked on any samples?
Coming back after quite a while to see if anything has happened in the last year or so.
I realize it's been a very very long time, but I finally restored the OEP and IAT for my unpacked executable.
If anyone in the future finds this, "Process-Dump" is currently not compatible with ASProtect.