goldshtn/msos

Unable to run !ClrStack -a because of duplicite stack frames

stej opened this issue · 5 comments

stej commented

I'm running !ClrStack -a and I get this error message for some threads

Exception during command execution -- InvalidOperationException: 'Sequence contains more than one matching element'
   at System.Linq.Enumerable.SingleOrDefault[TSource](IEnumerable`1 source, Func`2 predicate)
   at msos.ClrThreadExtensions.FrameArgumentsAndLocalsRetriever.ProcessFrame(IXCLRDataStackWalk stackWalk) in C:\dev\github\msos\msos\ClrThreadExtensions.cs:line 288
   at msos.ClrThreadExtensions.FrameArgumentsAndLocalsRetriever.ProcessStackWalk(UInt32 osThreadId) in C:\dev\github\msos\msos\ClrThreadExtensions.cs:line 216
   at msos.ClrThreadExtensions.FrameArgumentsAndLocalsRetriever..ctor(ClrThread thread, IList`1 stackTrace, CommandExecutionContext context) in C:\dev\github\msos\msos\ClrThreadExtensions.cs:line 200
   at msos.ClrThreadExtensions.WriteStackTraceToContext(ClrThread thread, IList`1 stackTrace, CommandExecutionContext context, Boolean displayArgumentsAndLocals) in C:\dev\github\msos\msos\ClrThreadExtensions.cs:line 64
   at msos.ClrThreadExtensions.WriteCurrentStackTraceToContext(ClrThread thread, CommandExecutionContext context, Boolean displayArgumentsAndLocals) in C:\dev\github\msos\msos\ClrThreadExtensions.cs:line 54
   at msos.CLRStack.Execute(CommandExecutionContext context) in C:\dev\github\msos\msos\CLRStack.cs:line 31
   at msos.CommandExecutionContext.ExecuteOneCommand(String command) in C:\dev\github\msos\msos\CommandExecutionContext.cs:line 122
Proceed at your own risk, or restart the debugging session.

The callstack I'm analyzing looks like this:

 -> HelperMethodFrame_1OBJ
 -> System.Threading.WaitHandle.InternalWaitOne(System.Runtime.InteropServices.SafeHandle, Int64, Boolean, Boolean)
 -> System.Threading.WaitHandle.WaitOne(System.TimeSpan, Boolean)
 -> System.Threading.WaitHandle.WaitOne(System.TimeSpan)
 -> XY.Z.FollowerEventListener.<StartHeartBeatMonitor>b__17_0()
 -> System.Threading.Tasks.Task.Execute()
 -> System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
 -> System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
 -> System.Threading.Tasks.Task.ExecuteWithThreadLocal(System.Threading.Tasks.Task ByRef)
 -> System.Threading.Tasks.Task.ExecuteEntry(Boolean)
 -> System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
 -> System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
 -> System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object)
 -> System.Threading.ThreadHelper.ThreadStart(System.Object)
 -> GCFrame

The code crashes on inspeciting the frame System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
on this line var matchingFrame = _stackTrace.SingleOrDefault(f => f.DisplayString == frameArgsLocals.MethodName); (see ClrThreadExtensions, method ProcessFrame, my line numbers are a little bit different from the ones on github).

Possible solutions

  • replace the SingleOrDefault with Where and ignore cases where count is != 1
  • pass frame index into ProcessFrame and use this index like this var matchingFrame = _stackTrace[index] - this looks obvious, so there was probably a catch, right?

Thanks for reporting this!

Well, the thing is that we don't have a frame index -- the frames are enumerated in that case by using the IXCLRDataStackWalk::Next enumerator method, and I don't really have any guarantee that these frames will exactly agree with the frames passed to the constructor as a IList<ClrStackFrame>; although they probably will agree in most cases.

I suppose we could resort to passing the frame index but verifying that the display string matches the method name we get from the IXCLR interface. Otherwise, we could end up with totally mismatched and bogus locals/arguments.

/cc @Pavel-Durov

stej commented

@Pavel-Durov is somehow responsible for the code we are talking about?
Are you open for pull requests? I might try to implement what you suggested in previous comment.

@Pavel-Durov is a contributor who helped with some other features. I implemented this specific part, but I am more than open to pull requests -- if you have the time to tackle this, it would be most appreciated.

stej commented

Looking at the issue. I made a sample program and made a dump of that program. I ran windbg on that dump and looked at the stack. Windbg shows this:

0:000> !clrstack
OS Thread Id: 0x6fa8 (0)
        Child SP               IP Call Site
000000104f10e428 00007ffe57856194 [InlinedCallFrame: 000000104f10e428] Microsoft.Win32.Win32Native.ReadFile(Microsoft.Win32.SafeHandles.SafeFileHandle, Byte*, Int32, Int32 ByRef, IntPtr)
000000104f10e428 00007ffe388ea5b4 [InlinedCallFrame: 000000104f10e428] Microsoft.Win32.Win32Native.ReadFile(Microsoft.Win32.SafeHandles.SafeFileHandle, Byte*, Int32, Int32 ByRef, IntPtr)
000000104f10e3f0 00007ffe388ea5b4 DomainNeutralILStubClass.IL_STUB_PInvoke(Microsoft.Win32.SafeHandles.SafeFileHandle, Byte*, Int32, Int32 ByRef, IntPtr)
000000104f10e4d0 00007ffe390feb81 System.IO.__ConsoleStream.ReadFileNative(Microsoft.Win32.SafeHandles.SafeFileHandle, Byte[], Int32, Int32, Boolean, Boolean, Int32 ByRef)
000000104f10e560 00007ffe390fea36 System.IO.__ConsoleStream.Read(Byte[], Int32, Int32)
000000104f10e5c0 00007ffe388c7fec System.IO.StreamReader.ReadBuffer()
000000104f10e610 00007ffe388c7d6d System.IO.StreamReader.ReadLine()
000000104f10e680 00007ffe392c931f System.IO.TextReader+SyncTextReader.ReadLine()
000000104f10e6e0 00007ffe39097a77 System.Console.ReadLine()
000000104f10e710 00007ffdda8006be dupstack.Program.Multi() [C:\dev\github\msos-stej\dupStackRepro\dupstack\dupstack\Program.cs @ 26]
000000104f10e7c0 00007ffdda8006de dupstack.Program.Multi() [C:\dev\github\msos-stej\dupStackRepro\dupstack\dupstack\Program.cs @ 30]
000000104f10e870 00007ffdda8006de dupstack.Program.Multi() [C:\dev\github\msos-stej\dupStackRepro\dupstack\dupstack\Program.cs @ 30]
000000104f10e920 00007ffdda8006de dupstack.Program.Multi() [C:\dev\github\msos-stej\dupStackRepro\dupstack\dupstack\Program.cs @ 30]
000000104f10e9d0 00007ffdda8006de dupstack.Program.Multi() [C:\dev\github\msos-stej\dupStackRepro\dupstack\dupstack\Program.cs @ 30]
000000104f10ea80 00007ffdda80050d dupstack.Program.Main(System.String[]) [C:\dev\github\msos-stej\dupStackRepro\dupstack\dupstack\Program.cs @ 15]
000000104f10ed30 00007ffe39df6573 [GCFrame: 000000104f10ed30] 
stej commented

The test program is this:
Program.zip
Very simple, the vars there are just for testing.

Modified the code so that it takes zerobased stackFrameIndex like this

            private void ProcessStackWalk(uint osThreadId)
            {
....
                int idx = 0;
                while (HR.S_OK == stackWalk.Next())
                {
                    ProcessFrame(stackWalk, idx);
                    idx++;
                }
            }
......
            private void ProcessFrame(IXCLRDataStackWalk stackWalk, int stackFrameIndex)
            {
                object tmp;

                if (HR.Failed(stackWalk.GetFrame(out tmp)))
                    return;

                IXCLRDataFrame frame = (IXCLRDataFrame)tmp;

                StringBuilder methodName = new StringBuilder(MaxNameSize);
                uint methodNameLen;
                if (HR.Failed(frame.GetCodeName(0 /*default flags*/, (uint)methodName.Capacity, out methodNameLen, methodName)))
                    return;
                Console.WriteLine($"by frame: {methodName}");
                Console.WriteLine($"by index: {_stackTrace[stackFrameIndex]}");
.......
}

and the result is interesting.
(I copied the output to console to paste it here)

When using the frame (COM object), I get

by frame: DomainNeutralILStubClass.IL_STUB_PInvoke(Microsoft.Win32.SafeHandles.SafeFileHandle, Byte*, Int32, Int32 ByRef, IntPtr)
by frame: DomainNeutralILStubClass.IL_STUB_PInvoke(Microsoft.Win32.SafeHandles.SafeFileHandle, Byte*, Int32, Int32 ByRef, IntPtr)
by frame: System.IO.__ConsoleStream.ReadFileNative(Microsoft.Win32.SafeHandles.SafeFileHandle, Byte[], Int32, Int32, Boolean, Boolean, Int32 ByRef)
by frame: System.IO.__ConsoleStream.Read(Byte[], Int32, Int32)
by frame: System.IO.StreamReader.ReadBuffer()
by frame: System.IO.StreamReader.ReadLine()
by frame: System.IO.TextReader+SyncTextReader.ReadLine()
by frame: System.Console.ReadLine()
by frame: dupstack.Program.Multi()
by frame: dupstack.Program.Multi()
by frame: dupstack.Program.Multi()
by frame: dupstack.Program.Multi()
by frame: dupstack.Program.Multi()
by frame: dupstack.Program.Main(System.String[])

There is duplicated DomainNeutralILStubClass.IL_STUB_PInvoke. Why?

When I use the index approach, I get

by index: [InlinedCallFrame] (Microsoft.Win32.Win32Native.ReadFile)
by index: [InlinedCallFrame] (Microsoft.Win32.Win32Native.ReadFile)
by index: DomainNeutralILStubClass.IL_STUB_PInvoke(Microsoft.Win32.SafeHandles.SafeFileHandle, Byte*, Int32, Int32 ByRef, IntPtr)
by index: System.IO.__ConsoleStream.ReadFileNative(Microsoft.Win32.SafeHandles.SafeFileHandle, Byte[], Int32, Int32, Boolean, Boolean, Int32 ByRef)
by index: System.IO.__ConsoleStream.Read(Byte[], Int32, Int32)
by index: System.IO.StreamReader.ReadBuffer()
by index: System.IO.StreamReader.ReadLine()
by index: System.IO.TextReader+SyncTextReader.ReadLine()
by index: System.Console.ReadLine()
by index: dupstack.Program.Multi()
by index: dupstack.Program.Multi()
by index: dupstack.Program.Multi()
by index: dupstack.Program.Multi()
by index: dupstack.Program.Multi()

So there are the inline stack frames that are not in the first listing (by frame).

Why there is the difference, is not very clear to me..