EpicGamesExt/raddebugger

Step over not working in v0.9.9

Closed this issue · 20 comments

Step over is working totally fine in v0.9.8

0.9.8.mp4

In v0.9.9, step over doesn't seem to work. Sometimes the target gets stuck and the status in raddbg says running.

0.9.9.mp4

Tried deleting all the RADDBGI files, no luck.

It's not the for loop.
It happens in my hobby project, and I'm not able to repro it on my work project.

Here is the project in case: https://github.com/MohitSethi99/ArcGameEngine
Branch: dx12
Setup: scripts/GenerateSolution.bat
PS: It will require dotnet 7 components from VS installer

Does this look completely deterministic to you? In other words, is it always failing the step operation at the same locations? I think this may have to do with the process memory cache not updating in time for the step operation, and the frontend not having any kind of retry mechanism.

Yes, always failing at the same location.

Since this is a hobby project, can you just send me a build? I am running into issues almost immediately while getting it set up, and I think it would be faster to have a prebuilt binary I can test with locally.

Definitely, will send it as soon as I reach home. (1~2hrs)

Here you go: https://drive.google.com/file/d/1lCLeUtlpSvuQvlkNH1gyEkV4jfPE2d4j/view?usp=sharing

Exe: Arc-Editor.exe
Upon launch it will ask for a project to open. Simply clicking cancel should open the editor with no project loaded.
A good place to check is the while loop in ArcEngine::Application::Run()

Thanks! I'll take a look.

I was trying to figure out the behaviour.

Every time I step into one of the functions where step over misbehaves (see the videos above), it tries to take me to the wrong address (see callstack)

StepInto.mp4

Also, it works (step over and step into) if I focus the disassembly view (Not really sure if that's helpful)

Disass.mp4

Turns out remedybg behaves the same way for "step into" and doesn't take me to the correct location. I'll investigate further and update here.

The issue happens when I call a function in a DLL, specifically nethost.dll, which I use to enable the scripting backend in my engine. It allows interop with C#. More details here: https://learn.microsoft.com/en-us/dotnet/core/tutorials/netcore-hosting
Will create a small example by the weekend.
Weird that it was working fine in 0.9.8 though.

Here is the visual studio solution for the minimal reproduction: https://drive.google.com/file/d/1rFZjYVJ2vn9ABg3JkbCZ7uOAJgL4RElt/view?usp=sharing

The zip is bloated because of the dotnet files. The output directory has the "App.runtimeconfig.json" config file and a ".dotnet" folder, which contains the dotnet framework that will be loaded by the code in Main.cpp

image
According to the Visual Studio output, an exception is thrown (presumably in the dotnet library, in KernelBase.dll) when Line 39 is executed. The rest of the code executes fine in VS and remedybg, but raddbg's step_over and step_into break after Line 40, which hints at an issue with how exceptions in modules are handled.

Hope this is helpful!

Issue is observed after this particular exception:
Exception thrown at 0x00007FFE0252543C (KernelBase.dll) in Arc-Editor.exe: 0x04242420 (parameters: 0x0000000031415927, 0x00007FFD20B90000, 0x0000000C5270E670).
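As a side note on reading that code: if the value is interpreted with the NTSTATUS bit layout, the top two bits (31..30) carry the severity. Below is a minimal sketch of that decoding. The helper names `status_severity` and `looks_like_fault` are hypothetical, and treating only warning/error severities as faults is an assumption for illustration, not a description of how raddbg, remedybg, or Visual Studio classify exceptions.

```cpp
#include <cstdint>

// Severity field of an NTSTATUS-style code: bits 31..30.
// 0 = success, 1 = informational, 2 = warning, 3 = error.
constexpr std::uint32_t status_severity(std::uint32_t code) {
    return code >> 30;
}

// Hypothetical heuristic (an assumption for illustration only):
// treat warning and error severities as likely faults, and
// success/informational severities as likely notifications.
constexpr bool looks_like_fault(std::uint32_t code) {
    return status_severity(code) >= 2;
}

// The code from the exception message above has severity 0.
static_assert(status_severity(0x04242420u) == 0, "severity: success");
// An access violation (0xC0000005) has severity 3.
static_assert(status_severity(0xC0000005u) == 3, "severity: error");
```

Under this reading, 0x04242420 has severity 0, so it does not look like a hard fault such as an access violation. That would fit with VS continuing past it, though it says nothing about why stepping broke afterwards.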

image

Also noticed that this is written to the RadDebugger's output view:

onecore\vm\dv\storage\plan9\rdr\dll\util.cpp(99)\p9np.dll!00007FF9B890F0CC: (caller: 00007FF9B89093B0) LogHr(1) tid(1bf8) C0000034     Msg:[?????????????????????????????????????????????????????/??????????????????????????????????????????????)] 

Edit:
Sorry about this, it has nothing to do with this bug. The line above is written every time GetOpenFileNameA is called.

Sorry, I've been a bit swamped with other stuff. Can you somehow resend that minimal repro to me? Google removed it due to "terms of service violations"... sigh...

I think #106 is related, not sure though.

I think probably not, but not clear... Just fixed that one again in 489ae56, digging into this one now.

Well, that took way longer than I expected, and it took way longer than it deserved! This was being caused by an incredibly stupid state machine bug I introduced during a pass over the Demon layer between 0.9.8 and 0.9.9, and it was difficult to locate. This should be fixed in 4358778.

Thanks a lot! It works now :)