Backtrace from wrong CPU in kernel core dump
For kernel core dumps, drgn starts the backtrace for a task that was running at the time of the crash from the registers saved in an NT_PRSTATUS note in the core dump. These notes are supposed to be indexed by CPU number. However, if the registers could not be saved for a given CPU, its note is omitted, which shifts the index of every CPU after it, so a lookup by CPU number can return another CPU's registers. This can happen in at least a couple of cases:
- If the CPU is offline (see #391).
- If the CPU was locked up and didn't respond to the crash NMI.
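To make the index shift concrete, here is a small self-contained simulation. It does not use drgn or a real core dump; `build_notes`, `naive_lookup`, and the `saved_on_cpu` field are hypothetical names made up for illustration, and the notes are modeled as plain dictionaries.

```python
from typing import List, Optional, Set


def build_notes(num_cpus: int, missing: Set[int]) -> List[dict]:
    """Model the NT_PRSTATUS notes of a dump: one note per CPU, in CPU
    order, except that CPUs in `missing` contribute no note at all."""
    return [{"saved_on_cpu": cpu} for cpu in range(num_cpus) if cpu not in missing]


def naive_lookup(notes: List[dict], cpu: int) -> Optional[dict]:
    """Treat the position of a note as the CPU number (the assumption
    that breaks when a note is omitted)."""
    return notes[cpu] if cpu < len(notes) else None


# Four CPUs; CPU 1 never saved its registers (offline or locked up).
notes = build_notes(4, {1})

for cpu in range(4):
    note = naive_lookup(notes, cpu)
    actual = None if note is None else note["saved_on_cpu"]
    print(f"asked for CPU {cpu}, got registers saved on CPU {actual}")
```

In this model, CPU 0 is still correct, CPUs 1 and 2 get the registers of the next CPU up, and the last CPU finds no note at all.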
The first case could be detected by looking at the online CPU mask, but the second case can't easily be corrected. This means we can't rely on NT_PRSTATUS. Instead, we probably have to look at the crash_notes per-CPU variable, which is what ends up in NT_PRSTATUS anyways.
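As a rough sketch of what reading crash_notes directly might involve, the following parses the first ELF note out of a raw per-CPU buffer and reports whether anything was saved there. This is not drgn's implementation. It assumes a little-endian dump, the standard ELF note layout (three 32-bit words followed by 4-byte-aligned name and descriptor), and that a CPU which never saved its registers leaves its buffer zero-filled. `parse_first_note` and `make_note` are hypothetical helpers; in drgn, the bytes could presumably be obtained with the `per_cpu_ptr` helper and `Program.read`, which is not shown here.

```python
import struct
from typing import Optional, Tuple

NT_PRSTATUS = 1  # ELF note type for saved process status / registers


def _align4(n: int) -> int:
    """Round n up to a multiple of 4, as ELF notes require."""
    return (n + 3) & ~3


def parse_first_note(buf: bytes) -> Optional[Tuple[str, int, bytes]]:
    """Return (name, type, desc) for the first note in buf, or None if
    the buffer is empty (assumed zero-filled) or truncated."""
    if len(buf) < 12:
        return None
    namesz, descsz, n_type = struct.unpack_from("<III", buf, 0)
    if namesz == 0 and descsz == 0 and n_type == 0:
        return None  # nothing was saved for this CPU
    desc_off = 12 + _align4(namesz)
    if desc_off + descsz > len(buf):
        return None  # header claims more data than the buffer holds
    name = buf[12:12 + namesz].rstrip(b"\0").decode("ascii", "replace")
    return name, n_type, buf[desc_off:desc_off + descsz]


def make_note(name: bytes, n_type: int, desc: bytes) -> bytes:
    """Build a note for testing; real buffers come from the dump."""
    name_z = name + b"\0"
    header = struct.pack("<III", len(name_z), len(desc), n_type)
    return (header
            + name_z.ljust(_align4(len(name_z)), b"\0")
            + desc.ljust(_align4(len(desc)), b"\0"))


saved = make_note(b"CORE", NT_PRSTATUS, b"\x01" * 8)
unsaved = bytes(64)  # stands in for a CPU that never wrote its note

print(parse_first_note(saved))
print(parse_first_note(unsaved))
```

Because each buffer belongs to a known CPU, a missing note shows up as an empty result for that CPU instead of shifting the numbering of the others.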
There's a complication here: for core dumps not from an actual kernel crash (like QEMU's dump-guest-memory), we still need to use NT_PRSTATUS.