DPC_WATCHDOG_VIOLATION from Shadow Hooks
frostiest opened this issue · 8 comments
I'm dealing with an issue similar to #26: security software freezes my system on load and eventually triggers a DPC_WATCHDOG_VIOLATION. I applied all the changes from that thread, but nothing helps. I have confirmed that the issue only arises when I hook something; otherwise there is no freeze.
I'm using the latest DdiMon, and here's the output from memory.dmp:
`
DPC_WATCHDOG_VIOLATION (133)
The DPC watchdog detected a prolonged run time at an IRQL of DISPATCH_LEVEL
or above.
Arguments:
Arg1: 0000000000000000, A single DPC or ISR exceeded its time allotment. The offending
component can usually be identified with a stack trace.
Arg2: 0000000000000501, The DPC time count (in ticks).
Arg3: 0000000000000500, The DPC time allotment (in ticks).
Arg4: fffff8045cd73358, cast to nt!DPC_WATCHDOG_GLOBAL_TRIAGE_BLOCK, which contains
additional information regarding this single DPC timeout
Debugging Details:
*** Either you specified an unqualified symbol, or your debugger ***
*** doesn't have full symbol information. Unqualified symbol ***
*** resolution is turned off by default. Please either specify a ***
*** fully qualified symbol module!symbolname, or enable resolution ***
*** of unqualified symbols by typing ".symopt- 100". Note that ***
*** enabling unqualified symbol resolution with network symbol ***
*** server shares in the symbol path may cause the debugger to ***
*** appear to hang for long periods of time when an incorrect ***
*** symbol name is typed or the network symbol server is down. ***
*** For some commands to work properly, your symbol path ***
*** must point to .pdb files that have full type information. ***
*** Certain .pdb files (such as the public OS symbols) do not ***
*** contain the required information. Contact the group that ***
*** provided you with these symbols if you need this command to ***
*** work. ***
*** Type referenced: TickPeriods ***
KEY_VALUES_STRING: 1
PROCESSES_ANALYSIS: 1
SERVICE_ANALYSIS: 1
STACKHASH_ANALYSIS: 1
TIMELINE_ANALYSIS: 1
DUMP_CLASS: 1
DUMP_QUALIFIER: 402
BUILD_VERSION_STRING:
SYSTEM_PRODUCT_NAME: To Be Filled By O.E.M.
SYSTEM_SKU: To Be Filled By O.E.M.
SYSTEM_VERSION: To Be Filled By O.E.M.
BIOS_VENDOR: American Megatrends Inc.
BIOS_VERSION: P3.10
BIOS_DATE: 07/04/2018
BASEBOARD_MANUFACTURER:
BASEBOARD_PRODUCT:
BASEBOARD_VERSION:
DUMP_TYPE: 0
BUGCHECK_P1: 0
BUGCHECK_P2: 501
BUGCHECK_P3: 500
BUGCHECK_P4: fffff8045cd73358
DPC_TIMEOUT_TYPE: SINGLE_DPC_TIMEOUT_EXCEEDED
CPU_COUNT: 6
CPU_MHZ: e10
CPU_VENDOR: GenuineIntel
CPU_FAMILY: 6
CPU_MODEL: 9e
CPU_STEPPING: a
CPU_MICROCODE: 6,9e,a,0 (F,M,S,R) SIG: 96'00000000 (cache) 96'00000000 (init)
BLACKBOXBSD: 1 (!blackboxbsd)
BLACKBOXNTFS: 1 (!blackboxntfs)
BLACKBOXWINLOGON: 1
DEFAULT_BUCKET_ID: WIN8_DRIVER_FAULT
BUGCHECK_STR: 0x133
PROCESS_NAME: System
CURRENT_IRQL: d
ANALYSIS_SESSION_HOST:
ANALYSIS_SESSION_TIME: 12-15-2019 22:05:37.0391
ANALYSIS_VERSION: 10.0.18362.1 amd64fre
LAST_CONTROL_TRANSFER: from fffff8045c9ee83d to fffff8045c9c1220
STACK_TEXT:
ffffbc01879bcb08 fffff8045c9ee83d : 0000000000000133 0000000000000000 0000000000000501 0000000000000500 : nt!KeBugCheckEx
ffffbc01879bcb10 fffff8045c81f857 : 0000014240a3c9a1 ffffbc01879c0180 0000000000000286 fffffd0a80837a10 : nt!KeAccumulateTicks+0x1cbe1d
ffffbc01879bcb70 fffff8045d2b91e1 : 0000000000000000 ffffd108652d4400 fffffd0a80837a90 ffffd108652d44b0 : nt!KeClockInterruptNotify+0xc07
ffffbc01879bcf30 fffff8045c8029e5 : ffffd108652d4400 0000000000000000 0000000000000000 ffff1d2c7670378f : hal!HalpTimerClockIpiRoutine+0x21
ffffbc01879bcf60 fffff8045c9c2cba : fffffd0a80837a90 ffffd108652d4400 000000000000fffe ffffd108652d4400 : nt!KiCallInterruptServiceRoutine+0xa5
ffffbc01879bcfb0 fffff8045c9c3227 : ffffbc0188259ef0 ffffbc0188259ef0 000000000000002a 0000000000000000 : nt!KiInterruptSubDispatchNoLockNoEtw+0xfa
fffffd0a80837a10 fffff8045c81b9eb : ffffffffffffffd2 fffff8045cdab573 0000000000000010 0000000000000286 : nt!KiInterruptDispatchNoLockNoEtw+0x37
fffffd0a80837ba0 fffff8045cdab585 : ffffd108654fdfa0 ffffd108653f7840 0000000000000010 ffffd108654fd000 : nt!KeYieldProcessorEx+0x1b
fffffd0a80837bd0 fffff8045cdaa6e5 : 000000001eb9414f ffffbc01879c0180 ffffd10879a92a90 fffffd0a80837c90 : nt!IopLiveDumpProcessCorralStateChange+0x2d
fffffd0a80837c00 fffff8045c86ae85 : ffffbc01879c2f80 ffffd108653f6000 ffffd108673cf260 ffffbc0100000002 : nt!IopLiveDumpCorralDpc+0x55
fffffd0a80837c40 fffff8045c86a4df : ffffbc01879c0180 0000000000000000 0000000000000002 0000000000000004 : nt!KiExecuteAllDpcs+0x305
fffffd0a80837d80 fffff8045c9c8265 : 0c45b60f450d55b6 ffffbc01879c0180 0000000000000000 00000000c0000002 : nt!KiRetireDpcList+0x1ef
fffffd0a80837fb0 fffff8045c9c8050 : 0000000000000000 ffffd108672171e0 0000000000000000 fffff8045cc68c80 : nt!KxRetireDpcList+0x5
fffffd0a87966b20 fffff8045c9c7720 : 0000000000000000 fffffd0a87966bd0 ffffd108652d4400 0000000000000000 : nt!KiDispatchInterruptContinue
fffffd0a87966b50 fffff8045c901064 : ffffd108766c3550 fffff8045e4cc089 ffffa98c2f402a80 fffff8045c8c2cfc : nt!KiDpcInterrupt+0x2f0
fffffd0a87966ce0 fffff8045c8c12ae : fffff8045cc68c80 0000000000000002 ffffbc0187dc7180 0000000000001000 : nt!ExpWaitForSpinLockSharedAndAcquire+0x64
fffffd0a87966d20 fffff8045c855d97 : fffff8045cc68bc0 ffff9bfc02391008 ffffd108766c3601 fffff8045cc68bc0 : nt!MiLockWorkingSetShared+0xee
fffffd0a87966d50 fffff8045ced3bb5 : fffff804722000e0 fffff804722000e0 000000000000000c 0000000000000fff : nt!MiLockCode+0x147
fffffd0a87966ef0 fffff8047220614c : ffffd108766c3550 fffffd0a879670e0 ffffd108673afa40 ffffd108673afa40 : nt!MmResetDriverPaging+0xa5
fffffd0a87966f20 fffff80472206037 : 00000000656c6946 fffff8045cb6f06d fffffd0a879670e0 0000000000000000 : Msfs!MsCommonCreate+0xdc
fffffd0a87966ff0 fffff8045c831f39 : ffffd10865fe3010 ffffd108766c3550 0000000000000000 0000000000000000 : Msfs!MsFsdCreate+0x27
fffffd0a87967020 fffff8045e4fce8c : 0000000000000010 0000000000000000 0000000000000000 0000000000000000 : nt!IofCallDriver+0x59
fffffd0a87967060 fffff8045c831f39 : ffffd1087d5a8100 fffff8045cde590c 0000000000000000 0000000000000030 : FLTMGR!FltpCreate+0x46c
fffffd0a87967110 fffff8045c830fe4 : 0000000000000000 0000000000000000 ffffd108766c36f8 fffff8045c8317a3 : nt!IofCallDriver+0x59
fffffd0a87967150 fffff8045cde5ffb : fffffd0a87967410 fffff8045cde590c fffffd0a87967380 ffffd1087aa4a4e0 : nt!IoCallDriverWithTracing+0x34
fffffd0a879671a0 fffff8045cdecfcf : ffffd108673af8f0 ffffd108673af80c ffffd10877eaf7e0 ffffa98c2fa04000 : nt!IopParseDevice+0x62b
fffffd0a87967310 fffff8045cdeb431 : ffffd10877eaf700 fffffd0a87967558 0000000000000240 ffffd108652f7380 : nt!ObpLookupObjectName+0x78f
fffffd0a879674d0 fffff8045ce30300 : ffffd10800000001 fffffd0a87967998 0000000000000000 ffffbc0187dc7180 : nt!ObOpenObjectByNameEx+0x201
fffffd0a87967610 fffff8045ce2fa38 : fffffd0a879679e0 0000000000000080 fffffd0a87967998 fffffd0a87967988 : nt!IopCreateFile+0x820
fffffd0a879676b0 fffff8045c9d2b15 : ffffd108766c3550 0000000000000002 0000000000000170 0000000000000001 : nt!NtOpenFile+0x58
fffffd0a87967740 fffff8045c9c5060 : fffff8045ce447c4 0000000000000000 0000000000000000 0000000000040282 : nt!KiSystemServiceCopyEnd+0x25
fffffd0a87967948 fffff8045ce447c4 : 0000000000000000 0000000000000000 0000000000040282 fffff8045c869d59 : nt!KiServiceLinkage
fffffd0a87967950 fffff8045e503b6c : 0000000000000000 0000000000000000 0000000000000030 fffff8045e4ea800 : nt!IoGetDeviceObjectPointer+0x94
fffffd0a879679e0 fffff8045c8bd465 : ffffd108652a0500 ffffd10878185040 fffff8045e503aa0 ffffd108652a0500 : FLTMGR!FltpManualDeviceAttachWorker+0xcc
fffffd0a87967a70 fffff8045c92a725 : ffffd10878185040 0000000000000080 ffffd10865294380 000024efbd9bbfff : nt!ExpWorkerThread+0x105
fffffd0a87967b10 fffff8045c9c886a : ffffbc0187e79180 ffffd10878185040 fffff8045c92a6d0 0000000000000000 : nt!PspSystemThreadStartup+0x55
fffffd0a87967b60 0000000000000000 : fffffd0a87968000 fffffd0a87961000 0000000000000000 0000000000000000 : nt!KiStartSystemThread+0x2a
THREAD_SHA1_HASH_MOD_FUNC: 20df6024a5c03bdd36d1680f5e2bf77cf839246c
THREAD_SHA1_HASH_MOD_FUNC_OFFSET: 34208103f2bc76fcde0bfeae8b7994e947ea59f9
THREAD_SHA1_HASH_MOD: 927a0c775c0b5510d0e1cbb5ae141a64633523eb
FOLLOWUP_IP:
Msfs!MsCommonCreate+dc
fffff804`7220614c 33d2 xor edx,edx
FAULT_INSTR_CODE: 8d48d233
SYMBOL_STACK_INDEX: 13
SYMBOL_NAME: Msfs!MsCommonCreate+dc
FOLLOWUP_NAME: MachineOwner
MODULE_NAME: Msfs
IMAGE_NAME: Msfs.SYS
DEBUG_FLR_IMAGE_TIMESTAMP: 0
STACK_COMMAND: .thread ; .cxr ; kb
BUCKET_ID_FUNC_OFFSET: dc
FAILURE_BUCKET_ID: 0x133_DPC_Msfs!MsCommonCreate
BUCKET_ID: 0x133_DPC_Msfs!MsCommonCreate
PRIMARY_PROBLEM_CLASS: 0x133_DPC_Msfs!MsCommonCreate
TARGET_TIME: 2019-12-14T05:09:53.000Z
OSBUILD: 18362
OSSERVICEPACK: 0
SERVICEPACK_NUMBER: 0
OS_REVISION: 0
SUITE_MASK: 272
PRODUCT_TYPE: 1
OSPLATFORM_TYPE: x64
OSNAME: Windows 10
OSEDITION: Windows 10 WinNt TerminalServer SingleUserTS
OS_LOCALE:
USER_LCID: 0
OSBUILD_TIMESTAMP: unknown_date
BUILDDATESTAMP_STR: 190318-1202
BUILDLAB_STR: 19h1_release
BUILDOSVER_STR: 10.0.18362.1.amd64fre.19h1_release.190318-1202
ANALYSIS_SESSION_ELAPSED_TIME: 19b1
ANALYSIS_SOURCE: KM
FAILURE_ID_HASH_STRING: km:0x133_dpc_msfs!mscommoncreate
FAILURE_ID_HASH: {349b4bae-9c52-cd03-915a-31077bf2ad14}
Followup: MachineOwner
`
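For anyone reading the dump above, Arg2 and Arg3 are hex tick counts, so the DPC ran only one tick past its allotment before the watchdog fired. A minimal sketch of that arithmetic follows. The helper name and the 15.625 ms tick period are assumptions: the dump could not resolve `TickPeriods`, so the real tick length on this machine is unknown.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical summary of bug check 0x133 arguments (not a WinDbg API).
struct DpcWatchdogSummary {
  std::uint64_t ticks_over;  // Arg2 - Arg3
  double allotment_seconds;  // Arg3 converted with the assumed tick period
};

// tick_ms is an ASSUMPTION (a common default clock interval); the dump
// above does not state the actual tick period.
DpcWatchdogSummary SummarizeDpcTimeout(std::uint64_t ticks_run,
                                       std::uint64_t ticks_allotted,
                                       double tick_ms = 15.625) {
  assert(ticks_run >= ticks_allotted);
  return {ticks_run - ticks_allotted, ticks_allotted * tick_ms / 1000.0};
}
```

With the values from this dump, `SummarizeDpcTimeout(0x501, 0x500)` reports one tick over budget and, under the assumed tick period, an allotment of about 20 seconds. That would mean a single DPC was stuck for a very long time rather than being slightly slow.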
I've been at this for a few days, so any help would be great.
Can you tell me which API, when hooked, causes the issue? I may be able to list a few possible causes.
It appears to be any API; it doesn't seem to matter which. As I said, though, if nothing is hooked but DdiMon/HyperPlatform is loaded, there's no freeze. I'm going to continue testing to see whether it is detecting the int 3 at the start of a shadow page, but beyond that I don't have any ideas.
Understood. What security software do you use? Some products have their own hypervisors and emulate VT-x, but can still cause compatibility issues like that.
If your security software has such a feature and disabling it is a solution, try that out. If you need to understand the root cause, uploading MEMORY.DMP with the PDB would be helpful, though I may not be able to spend much time on analysis.
If it's all the same to you, I'd rather not name the security software, but I can tell you with certainty that it doesn't use a hypervisor, because yesterday I tested hvpp, the hypervisor mentioned in #26, and it appears to work OK. I'd really rather use your software, though, if possible.
Do you have any ideas I can try, or do you need to debug?
A few things I've noticed during testing:
- It doesn't care at all about the int 3 on the shadow page. If I call ShpSetupInlineHook but not ShEnableHooks, it loads OK. By the same token, if I comment out the int 3 but still call ShEnableHooks, it freezes.
- I enabled kVmmpEnableRecordVmExit, but nothing is logged once it freezes, or even right before.
- I tested on my main computer as well as in VMware. In VMware, with WinDbg attached, the guest freezes forever once the security driver loads; the debugger doesn't break, and there's no bluescreen.
- I read in the thread I linked that there aren't many differences between hvpp and this project, and hvpp seems to load fine, so I think whatever is causing this may be targeted at HyperPlatform specifically.
I'll keep trying things and share if I find an answer, but at this point I'm pretty stumped.
Hi, thanks for sharing your experiments and results. I would study hvpp further to see what the key differences might be, and also try VMware + GDB stub debugging. This would let you see where it is freezing. It often does not uncover what's wrong, but it still gives some more information.
https://www.triplefault.io/2017/07/setup-vmm-debugging-using-vmwares-gdb_9.html
Hey tandasat, good news: I finally confirmed the cause. The issue was similar to the other thread: MmCopyMemory was being called on large regions, many times, and was so slow that it caused the DPC_WATCHDOG_VIOLATION. For this security software, though, simply removing the use of MTF for the shadow hooks wasn't enough; I had to hook MmCopyMemory and modify its results, and finally the software seems to load OK. If you know of any other ways to speed that up, that'd be great, but until then I'm going to go ahead and close this thread. Thanks for the help.
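The thread does not say what the MmCopyMemory hook actually does, so the following is only a sketch of a range check such a hook might start with: given a copy request, count how many hooked pages it overlaps. Reads that land on those pages are presumably the slow ones, since each can trigger EPT-violation handling. The function name and the list of hooked page addresses are hypothetical and not part of DdiMon.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::uint64_t kPageSize = 0x1000;

// Counts entries of hooked_pages (assumed to be page-aligned base
// addresses) that fall inside the pages spanned by [base, base + size).
std::size_t CountHookedPagesInRange(
    std::uint64_t base, std::uint64_t size,
    const std::vector<std::uint64_t>& hooked_pages) {
  assert(size > 0);
  const std::uint64_t first = base & ~(kPageSize - 1);
  const std::uint64_t last = (base + size - 1) & ~(kPageSize - 1);
  std::size_t count = 0;
  for (std::uint64_t page : hooked_pages) {
    if (page >= first && page <= last) {
      ++count;
    }
  }
  return count;
}
```

A request that returns zero here could be passed straight through to the original routine, leaving only the overlapping requests to be handled specially.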
Thank you for sharing your findings. This should help others with a similar issue narrow down the cause by themselves.
Apart from not using MTF (IIRC, that is how hvpp does it), replacing 0xcc with actual jmp instructions for hooking might help, as it reduces the performance penalty associated with the extra VM-exit. A relevant discussion can be found here:
tandasat/SimpleSvmHook#1
I do not have performance metrics, though, so I do not know whether this is the right place to invest optimization effort.
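To illustrate the jmp idea, here is a minimal sketch of one common x64 encoding for an absolute jump: `jmp qword ptr [rip+0]` followed by the 8-byte target. The helper name is hypothetical, and this is not necessarily the encoding DdiMon or SimpleSvmHook would use.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// FF 25 00 00 00 00 (jmp qword ptr [rip+0]) plus an 8-byte absolute target.
constexpr std::size_t kAbsJmpSize = 14;

// Hypothetical helper: builds the bytes that would replace the 0xcc
// at the hook site on the execute-only shadow page.
std::array<std::uint8_t, kAbsJmpSize> MakeAbsoluteJmp(std::uint64_t target) {
  assert(target != 0 && "a null jump target is almost certainly a bug");
  std::array<std::uint8_t, kAbsJmpSize> code = {0xFF, 0x25, 0, 0, 0, 0};
  // Store the target in little-endian order right after the instruction.
  for (std::size_t i = 0; i < sizeof(target); ++i) {
    code[6 + i] = static_cast<std::uint8_t>(target >> (8 * i));
  }
  return code;
}
```

Note that this patch is 14 bytes rather than the single byte of 0xcc, so the trampoline for the original function has to relocate every whole instruction the patch overwrites.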