[BUG] ][LOGMONITOR] ERROR: Failed to query next event. Error: 21.
israelvaldez opened this issue · 13 comments
Describe the bug
Suddenly some of my containers present the following error, which leaves them hanging on forever in kubernetes.
To Reproduce
Steps to reproduce the behavior:
- Run a container with the following configuration of entrypoint:
SHELL ["cmd", "/S", "/C"]
ENTRYPOINT ["cmd", "/S", "/C", "powershell","-command", "./Scripts/Run.ps1", "&&", "C:/LogMonitor/LogMonitor.exe", "ServiceMonitor.exe", "w3svc"]
- With a configuration for logmonitor.exe being the following:
{
"LogConfig": {
"sources": [{
"eventFormatMultiLine": false,
"channels": [{
"level": "error",
"name": "system"
}, {
"level": "Error",
"name": "application"
}
],
"type": "EventLog",
"startAtOldestRecord": true
}, {
"filter": "*.log",
"includeFileNames": true,
"directory": "c:\\inetpub\\logs",
"includeSubdirectories": true,
"type": "File"
}, {
"filter": "*.txt",
"includeFileNames": true,
"directory": "C:\\epi\\logs",
"includeSubdirectories": true,
"type": "File"
}
]
}
}
- Suddenly at some point the pod stops running with the following errors
2023-10-31T01:10:57.842902400Z [2023-10-31T01:10:57.000Z][LOGMONITOR] ERROR: Failed to query next event. Error: 21.
2023-10-31T01:10:57.842902400Z [2023-10-31T01:10:57.000Z][LOGMONITOR] ERROR: Failed to start event log monitor. Error: 21
2023-11-01T12:29:07.102387200Z [2023-11-01T12:29:07.000Z][LOGMONITOR] ERROR: Error monitoring log directory. Invalid directory change notification data.
2023-11-01T12:29:07.102467400Z [2023-11-01T12:29:07.000Z][LOGMONITOR] ERROR: Error monitoring log directory. Invalid directory change notification data.
Expected behavior
Logmonitor.exe should exit with code 1 if there's a catastrophic error or it should report something that we can act on to correct the issue.
Configuration
Specified in steps to reproduce
Additional context
I collected memory dumps of the process as it was hanging, if you need me to send them I can send it to you independently via email.
Hey @israelvaldez, which version of Log Monitor are you running?
Thanks, and which base image are you using?
mcr.microsoft.com/dotnet/framework/wcf:4.8-windowsservercore-ltsc2019
We'll try to repro this, we had a similar issue on the 2019 dotnet images: #150. It was fixed in version 2.0.2. This seems to be impacting file monitor as well which is strange.
Hiya @israelvaldez can you confirm if these errors appear on the latest version of Log Monitor that Ian pointed out. v2.0.2.
I tried repro it on my end and I did not see any regression.
===these are the logs I could observe
[2023-11-05T23:17:13.000Z][LOGMONITOR] INFO: Log directory \\?\C:\epi\logs does not exist. LogMonitor will wait for 300 seconds for the directory to be created.
[2023-11-05T23:17:13.000Z][LOGMONITOR] INFO: Waiting for system event channel to be enabled
[2023-11-05T23:17:28.000Z][LOGMONITOR] INFO: Enabled system event channel after 0 seconds.
{"Source": "EventLog","LogEntry": {"Time": "2023-10-10T19:35:59.000Z","Channel": "System","Level": "Error","EventId": 7023,"Message": "The Server service terminated with the following error: \r\nThe system cannot find the file specified."}}
{"Source": "EventLog","LogEntry": {"Time": "2023-10-10T19:35:59.000Z","Channel": "System","Level": "Error","EventId": 701,"Message": "Task Scheduler service failed to start Task Compatibility module. Tasks may not be able to register on previous Window versions. Additional Data: Error Value: 2147942450."}}
{"Source": "EventLog","LogEntry": {"Time": "2023-10-10T19:35:59.000Z","Channel": "System","Level": "Error","EventId": 701,"Message": "Task Scheduler service failed to start Task Compatibility module. Tasks may not be able to register on previous Window versions. Additional Data: Error Value: 2147942450."}}
{"Source": "EventLog","LogEntry": {"Time": "2023-10-10T19:53:34.000Z","Channel": "System","Level": "Error","EventId": 701,"Message": "Task Scheduler service failed to start Task Compatibility module. Tasks may not be able to register on previous Window versions. Additional Data: Error Value: 2147942450."}}
{"Source": "EventLog","LogEntry": {"Time": "2023-10-10T19:53:34.000Z","Channel": "System","Level": "Error","EventId": 701,"Message": "Task Scheduler service failed to start Task Compatibility module. Tasks may not be able to register on previous Window versions. Additional Data: Error Value: 2147942450."}}
{"Source": "EventLog","LogEntry": {"Time": "2023-10-10T19:53:34.000Z","Channel": "System","Level": "Error","EventId": 7023,"Message": "The Server service terminated with the following error: \r\nThe system cannot find the file specified."}}
{"Source": "EventLog","LogEntry": {"Time": "2023-10-10T19:55:53.000Z","Channel": "System","Level": "Error","EventId": 701,"Message": "Task Scheduler service failed to start Task Compatibility module. Tasks may not be able to register on previous Window versions. Additional Data: Error Value: 2147942450."}}
{"Source": "EventLog","LogEntry": {"Time": "2023-10-10T19:55:53.000Z","Channel": "System","Level": "Error","EventId": 701,"Message": "Task Scheduler service failed to start Task Compatibility module. Tasks may not be able to register on previous Window versions. Additional Data: Error Value: 2147942450."}}
{"Source": "EventLog","LogEntry": {"Time": "2023-10-10T19:55:53.000Z","Channel": "System","Level": "Error","EventId": 7023,"Message": "The LanmanServer service terminated with the following error: \r\nThe system cannot find the file specified."}}
{"Source": "EventLog","LogEntry": {"Time": "2023-10-10T19:56:28.000Z","Channel": "System","Level": "Error","EventId": 701,"Message": "Task Scheduler service failed to start Task Compatibility module. Tasks may not be able to register on previous Window versions. Additional Data: Error Value: 2147942450."}}
{"Source": "EventLog","LogEntry": {"Time": "2023-10-10T19:56:28.000Z","Channel": "System","Level": "Error","EventId": 701,"Message": "Task Scheduler service failed to start Task Compatibility module. Tasks may not be able to register on previous Window versions. Additional Data: Error Value: 2147942450."}}
{"Source": "EventLog","LogEntry": {"Time": "2023-10-10T19:56:28.000Z","Channel": "System","Level": "Error","EventId": 7023,"Message": "The LanmanServer service terminated with the following error: \r\nThe system cannot find the file specified."}}
I retained the JsonConfig file entries you shared and this is the context of my docker file to keep things simple
FROM mcr.microsoft.com/dotnet/framework/wcf:4.8-windowsservercore-ltsc2019
WORKDIR /LogMonitor
COPY LogMonitorConfig.json .
COPY LogMonitor.exe .
ENTRYPOINT ["C:\\LogMonitor\\LogMonitor.exe", "C:\\ServiceMonitor.exe", "w3svc"]
Hi @israelvaldez, would you be able to send the memory dumps collected to WindowsContainerGitHubIssues@service.microsoft.com?
Sent.
Seems WindowsContainerGitHubIssues@service.microsoft.com is not setup correctly as I recieved a message from postmaster@microsoft.com saying that address isn't setup to recieve messages from my email address.
Your message to the Microsoft 365 group WindowsContainerGitHubIssues@service.microsoft.com couldn't be delivered.
The group WindowsContainerGitHubIssues isn't set up to receive messages from .
Office 365 WindowsContainerGitH. . .
Sender Action Required
Sender not allowed
Would you be able to try again? There was a policy preventing external emails but that's fixed now.
Hello @israelvaldez, we are working on how to handle catastrophic failures in Log Monitor. We've not been able to repro the issue on our end. Could you please share your Docker file? Thanks!
@bobsira I can share the content of the powershell scripts via email so to not share in here and try to keep the NDA my company has with Microsoft, can you please let me know an email where I can send the files?
Today we also got a similar error that ultimately caused that one of our pods to be killed after hours of the following messages in the logs, are these raised on this issue?
[2023-12-07T19:07:32.000Z][LOGMONITOR] ERROR: Failed to query next event. Error: 21.
[2023-12-07T19:07:32.000Z][LOGMONITOR] ERROR: Failed to start event log monitor. Error: 21
[2023-12-07T14:24:56.000Z][LOGMONITOR] ERROR: Error monitoring log directory. Invalid directory change notification data.
[2023-12-07T13:05:29.000Z][LOGMONITOR] ERROR: Error monitoring log directory. Invalid directory change notification data.
[2023-12-07T11:47:31.000Z][LOGMONITOR] ERROR: Error monitoring log directory. Invalid directory change notification data.
[2023-12-07T10:27:48.000Z][LOGMONITOR] ERROR: Error monitoring log directory. Invalid directory change notification data.
[2023-12-07T09:10:05.000Z][LOGMONITOR] ERROR: Error monitoring log directory. Invalid directory change notification data.
@israelvaldez please send the files to WindowsContainerGitHubIssues@service.microsoft.com