MarioVilas/winappdbg

search_hexa does not work correctly

baderj opened this issue · 6 comments

The search_hexa and search_bytes methods sometimes don't return the correct address. For example:

pattern = "D8 E0 AF 5C 2F 10"
for address, match in process.search_hexa(pattern):
    read = process.read(address, len(pattern.split(" ")))
    print("memory at address", " ".join(["{:02X}".format(ord(_)) for _ in read]))
    print("search pattern", pattern)
    print("found match", " ".join(["{:02X}".format(ord(_)) for _ in match]))

might return:

('memory at address', '87 A0 00 00 00 8B')
('search pattern', 'D8 E0 AF 5C 2F 10')
('found match', 'D8 E0 AF 5C 2F 10')

The match is correct, but reading the process memory at the returned address shows that the address is wrong. The returned address isn't always wrong; in many cases it is correct.

Is this happening with the latest stable release, or with the development version here on GitHub?

With both. The problem is in Search.search_process. Contiguous blocks are merged into one buffer, but the search is performed while the buffer might still grow. Switching off:

if delta and address == prev_addr:
    buffer += read(process, address, page)

eliminates the bug, but then of course searches can't span more than one block. The easiest fix would probably be to first determine the buffer by merging adjacent blocks, and then perform the search.
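
A minimal sketch of that "merge first, then search" idea, independent of winappdbg. The helper names merge_adjacent and search_blocks, and the representation of memory as (base_address, data) pairs, are assumptions made for illustration; this is not the code in Search.search_process.

```python
def merge_adjacent(blocks):
    """Coalesce (base_address, data) blocks that touch end-to-start."""
    merged = []
    for base, data in sorted(blocks, key=lambda block: block[0]):
        if merged and merged[-1][0] + len(merged[-1][1]) == base:
            # This block starts exactly where the previous one ends.
            prev_base, prev_data = merged[-1]
            merged[-1] = (prev_base, prev_data + data)
        else:
            merged.append((base, data))
    return merged


def search_blocks(blocks, pattern):
    """Yield the absolute address of every occurrence of pattern."""
    # The buffers are complete before any search starts, so an offset
    # inside a buffer always maps to the same absolute address.
    for base, data in merge_adjacent(blocks):
        pos = data.find(pattern)
        while pos != -1:
            yield base + pos
            pos = data.find(pattern, pos + 1)


# Hypothetical memory layout: two adjacent blocks and one isolated block.
blocks = [(0x1000, b"..hel"), (0x1005, b"lo.."), (0x3000, b"hello")]
found = list(search_blocks(blocks, b"hello"))
```

Here the first match spans the boundary between the two adjacent blocks, and its address is computed from the base of the merged buffer.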

I just saw that you changed the routine by removing "delta". The commit message says that search is fixed. However, my tests still fail from time to time.

Are you still working on the issue? Otherwise I'll take a deeper look to provide more information.

Merging adjacent blocks would be bad if the total amount of memory to use is too large - that's what the algorithm was trying to avoid. I didn't get it to fail in my tests, can you provide an example so I can reproduce it?
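
One common way to keep memory bounded while still finding matches that straddle two blocks is to carry over only the last len(pattern) - 1 bytes of each buffer. The sketch below illustrates that technique under the assumption that blocks arrive sorted by address as (base_address, data) pairs; search_streaming is a hypothetical name, and this is not the algorithm used in winappdbg or pfind.py.

```python
def search_streaming(blocks, pattern):
    """Yield match addresses while holding one block plus a short tail."""
    if not pattern:
        raise ValueError("pattern must not be empty")
    keep = len(pattern) - 1   # longest prefix of a match a tail can hold
    tail = b""
    tail_base = 0             # absolute address of tail[0]
    for base, data in blocks:
        if tail and tail_base + len(tail) == base:
            # Adjacent to the previous block: prepend the carried tail.
            buf_base, buf = tail_base, tail + data
        else:
            # Gap in the address space: start over from this block.
            buf_base, buf = base, data
        pos = buf.find(pattern)
        while pos != -1:
            yield buf_base + pos
            pos = buf.find(pattern, pos + 1)
        # The tail is shorter than the pattern, so it can never contain
        # a full match and no address is reported twice.
        tail = buf[-keep:] if keep else b""
        tail_base = buf_base + len(buf) - len(tail)


# Hypothetical layout: the pattern straddles the first two blocks.
blocks = [(0x1000, b"abcD8"), (0x1005, b"E0xyz"), (0x2000, b"D8E0")]
found = list(search_streaming(blocks, b"D8E0"))
```

The address of each match is always derived from the base of the buffer that was actually searched, which is the invariant the original report suggests is being violated.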

Also, the pfind.py script does not have this bug and it contains the same algorithm, but it's a slightly different implementation. I'm not sure why one fails and the other one doesn't but it may provide some clues.

Junch commented

I hit the same issue in both version 1.5 and the latest version on GitHub.

In my small program I just define a global string, "hello world", run the program, and attach a small debugger to search for the string.

    bytes = "hello world"
    for address in process.search_bytes(bytes):
        print(HexDump.address(address))

    hexa = "68656C6C6F20776F726C64" # "hello world"
    for address in process.search_hexa(hexa):
        print(HexDump.address(address))

search_bytes works well, but search_hexa raises the exception below:

Traceback (most recent call last):
  File "lesson1.py", line 51, in <module>
    print(HexDump.address(address))
  File "C:\Python27\lib\site-packages\winappdbg\textio.py", line 525, in address
    return ('%%.%dX' % address_size) % address
TypeError: not all arguments converted during string formatting

The search_hexa function returns tuples, not addresses. The search_bytes function returns addresses only. That's why your example is failing: it crashes when converting the address to hex, not when searching, as you can tell from the stack trace.
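
The failure can be reproduced without a debugger. The sketch below reuses the formatting expression from the traceback; format_address, the address_size of 8, and the sample tuple are stand-ins chosen for illustration, not winappdbg code. Unpacking the tuple, as in the `for address, match in process.search_hexa(pattern)` loop from the first comment, avoids the error.

```python
def format_address(address, address_size=8):
    # Same formatting expression as the line shown in the traceback.
    # address_size=8 is an assumed value for this example.
    return ('%%.%dX' % address_size) % address


# Hypothetical stand-in for one item yielded by search_hexa:
# an (address, matched data) tuple.
result = (0x00402000, b"hello world")

# Passing the whole tuple makes the % operator treat it as two
# arguments for a format string with a single placeholder.
try:
    format_address(result)
    error = None
except TypeError as exc:
    error = str(exc)

# Unpacking first gives the formatter a plain integer.
address, data = result
formatted = format_address(address)
```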