microsoft/DbgShell

Skinny Null Terminated String Functions Don't Allow Partial Reads

Zhentar opened this issue · 3 comments

The wide string functions work how I would expect; however, the ASCII/UTF-8 string read functions simply throw exceptions with no partial results if the string is longer than the requested number of bytes.

Thank you for reporting the problem!

Can you clarify the exact names of the functions you are calling? And perhaps more info about the exception? Just to make sure the problem is what I think it is.

At first I did not expect any difference between the skinny and wide string reading at the dbgeng/mem-reading level, because they both funnel through ReadMemAs_String. But perhaps if you were reading a UTF8 string, and it got chopped at an inopportune point, the Encoding object might be throwing. In that case, you could try calling the overload that allows you to pass the Encoding object, and construct one with throwOnInvalidBytes set to false.
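To illustrate the "chopped at an inopportune point" case, here is a minimal sketch in Python rather than .NET. It is not DbgShell code: the sample string and the helper name `try_strict_decode` are made up for illustration, and Python's `errors="strict"` / `errors="replace"` stand in for an Encoding object that does or does not throw on invalid bytes.

```python
# Illustration only: a UTF-8 string cut in the middle of a multi-byte character.
full = "naïve".encode("utf-8")   # "ï" encodes to two bytes: 0xC3 0xAF
chopped = full[:3]               # b"na\xc3": ends after the first byte of "ï"


def try_strict_decode(data: bytes):
    """Hypothetical helper: strict decoding, returning None when it throws."""
    try:
        return data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return None


# Strict decoding rejects the dangling lead byte, so the caller gets nothing.
strict_result = try_strict_decode(chopped)

# Lenient decoding keeps the readable prefix and substitutes U+FFFD
# for the incomplete sequence.
lenient_result = chopped.decode("utf-8", errors="replace")

print(repr(strict_result))
print(repr(lenient_result))
```

The lenient path is the analogue of passing an Encoding that does not throw: the caller still gets the readable part of the string.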

But then I realized that there are two other sets of functions:

  1. One that funnels through ReadMemAs_szString, and those all call through dbgeng's IDebugDataSpaces.ReadMultiByteStringVirtualWide interface.
  2. Another that funnels through ReadMemAs_wszString, which used to call through IDebugDataSpaces.ReadUnicodeStringVirtualWide, but the comments indicate that I ran into problems with it (a problem similar to what you describe, even), and reimplemented the logic myself in terms of reading raw memory directly.

I'm guessing you ran into the difference between those two code paths, in which case the fix is probably to switch ReadMemAs_szString to also be implemented in terms of reading memory directly (IDebugDataSpaces.ReadVirtualDirect), just like ReadMemAs_wszString.
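A rough sketch of what "implemented in terms of reading memory directly" could look like, written in Python as a language-neutral illustration. None of this is DbgShell's actual code: `read_sz_string`, the `read_memory` callback, the chunk size, and the fake memory image are all hypothetical. The point is only that a hand-rolled reader can return whatever it managed to read, plus a flag saying whether a terminator was found, instead of failing outright.

```python
from typing import Callable, Tuple


def read_sz_string(
    read_memory: Callable[[int, int], bytes],  # hypothetical raw-memory reader
    address: int,
    max_bytes: int,
    chunk_size: int = 64,                      # arbitrary choice for the sketch
) -> Tuple[bytes, bool]:
    """Read up to max_bytes of a null-terminated byte string.

    Returns (data, terminated). If no terminator is found within max_bytes,
    the partial data is returned with terminated=False rather than raising.
    """
    collected = bytearray()
    while len(collected) < max_bytes:
        want = min(chunk_size, max_bytes - len(collected))
        chunk = read_memory(address + len(collected), want)
        if not chunk:
            break                      # nothing readable: keep what we have
        nul = chunk.find(b"\x00")
        if nul != -1:
            collected += chunk[:nul]   # stop at the terminator, exclude it
            return bytes(collected), True
        collected += chunk
        if len(chunk) < want:
            break                      # short read: memory ran out
    return bytes(collected), False


# Fake target memory for the example: 1023 characters plus a terminator.
BASE = 0x1000
MEMORY = b"A" * 1023 + b"\x00"


def fake_read_memory(address: int, size: int) -> bytes:
    offset = address - BASE
    return MEMORY[offset:offset + size]


partial, partial_terminated = read_sz_string(fake_read_memory, BASE, 768)
whole, whole_terminated = read_sz_string(fake_read_memory, BASE, 2048)
print(len(partial), partial_terminated)
print(len(whole), whole_terminated)
```

Decoding the returned bytes is a separate step, so the caller can choose how strictly to treat a multi-byte character that was cut off at the limit.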

The one I remember encountering it with most recently was ReadMemAs_putf8szString, so yeah, the ones that funnel through ReadMemAs_szString. When DbgEng read maxCch bytes without encountering a null terminator, it returned a buffer of maxCch length (which I presume contained the bytes of the string, though I did not verify) and HR E_NOINTERFACE, which then becomes a DbgMemoryAccessException.

Okay, I tried an experiment, and the results from dbgeng are just wacky.

I dynamically allocated an ASCII string of length 1024 and filled it with stuff (with a null terminator at the end).

When calling $debugger.ReadMemAs_szString( $addr, 768 ), the dbgeng API returned E_INVALIDARG. But when I asked for one more byte, $debugger.ReadMemAs_szString( $addr, 769 ), it succeeded. (Those lengths in hex are 0x300 and 0x301.) I'm guessing that dbgeng internally reads a larger chunk than is asked for, and if it finds the null terminator anywhere in what it read, it considers things "fine", even if that null terminator is not going to be part of the [truncated] string that is returned to the caller.

I could file a bug on the dbgeng team... but even if they care, I don't know when they'll fix it; I think DbgShell will just need to implement its own string reading if it wants consistent, sensical behavior.