Azure/WALinuxAgent

[BUG] waagent -collect-logs doesn't work in RHEL-9/10 (WALA 2.9.1.1) and the log is confusing

Closed this issue · 5 comments

Describe the bug

On RHEL-10 the WALA cgroups feature cannot be enabled, so log collection is not allowed. This may be expected because of #2637, but the error logged when running "waagent -collect-logs" is confusing:

# waagent -collect-logs
...
2024-07-09T03:07:55.659321Z ERROR MainThread LogCollector Log collection completed unsuccessfully. Error: [CGroupsException] Failed to read cpuacct.stat: expected str, bytes or os.PathLike object, not NoneType
...

Steps to reproduce

  1. On RHEL-9 or RHEL-10, install WALA v2.9.1.1
  2. Run "waagent -collect-logs"

Distro and WALinuxAgent details:

  • Distro and Version: RHEL-9/10
  • WALinuxAgent version:
    WALinuxAgent-2.9.1.1 running on rhel 10.0
    Python: 3.12.2
    Goal state agent: 2.9.1.1

Additional context

Log file attached

# waagent -collect-logs
2024-07-09T03:07:55.629123Z INFO MainThread LogCollector Running log collector mode normal
2024-07-09T03:07:55.630477Z INFO MainThread LogCollector WireServer endpoint 168.63.129.16 read from file
2024-07-09T03:07:55.630736Z INFO MainThread LogCollector Wire server endpoint:168.63.129.16
2024-07-09T03:07:55.630962Z INFO MainThread LogCollector Forcing an update of the goal state.
2024-07-09T03:07:55.640532Z INFO MainThread Fetched a new incarnation for the WireServer goal state [incarnation 1]
2024-07-09T03:07:55.641873Z INFO MainThread 
2024-07-09T03:07:55.642163Z INFO MainThread Fetching full goal state from the WireServer [incarnation 1]
2024-07-09T03:07:55.646676Z INFO MainThread Fetch goal state completed
2024-07-09T03:07:55.659321Z ERROR MainThread LogCollector Log collection completed unsuccessfully. Error: [CGroupsException] Failed to read cpuacct.stat: expected str, bytes or os.PathLike object, not NoneType
2024-07-09T03:07:55.659552Z INFO MainThread LogCollector Detailed log output can be found at /var/lib/waagent/logcollector/results.txt
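
The "expected str, bytes or os.PathLike object, not NoneType" part of the error is the standard Python TypeError raised when a path argument is None. The sketch below reproduces that message shape under the assumption that the cgroup directory was never resolved. It is an illustration only, not the agent's code: `read_cpuacct_stat`, its parameter, and the RuntimeError wrapping are hypothetical.

```python
import os


def read_cpuacct_stat(cgroup_path):
    """Hypothetical helper: read cpuacct.stat under the given cgroup directory."""
    try:
        with open(os.path.join(cgroup_path, "cpuacct.stat")) as f:
            return f.read()
    except TypeError as e:
        # Assumption: wrap the low-level error the way the logged message suggests.
        raise RuntimeError("Failed to read cpuacct.stat: {0}".format(e))


try:
    # None stands in for a cgroup path that could not be determined.
    read_cpuacct_stat(None)
except RuntimeError as e:
    message = str(e)
    print(message)
```

Read this way, the message says that no cgroup path was available rather than that the file itself was unreadable, which may be why it looks confusing on a system where cgroups monitoring is disabled.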

@yuxisun1217 This is an issue in v2.9.1.1 of the agent (#2929). We added resource monitoring on collect-logs in v2.9.1.1 which broke the command line option.

The fix for this was released in versions 2.10.0.8+
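
For anyone scripting a check against that version, a minimal sketch follows. It assumes the agent version is a purely numeric dotted string such as "2.9.1.1"; `parse_version` and `has_collect_logs_fix` are hypothetical helpers, not part of the agent. Comparing as integer tuples matters because a plain string comparison would sort "2.9.1.1" after "2.10.0.8".

```python
def parse_version(text):
    """Turn a dotted numeric version such as '2.10.0.8' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


# First version reported in this thread as containing the fix.
FIXED_IN = parse_version("2.10.0.8")


def has_collect_logs_fix(agent_version):
    """Hypothetical check: True if agent_version is at or above 2.10.0.8."""
    return parse_version(agent_version) >= FIXED_IN


print(has_collect_logs_fix("2.9.1.1"))   # False
print(has_collect_logs_fix("2.10.0.8"))  # True
```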

This issue was fixed in 2.10.0.8+

@maddieford Can you point us to the specific set of patches? Maybe we can try to backport them.

#2929

Is it this one? #2939

This is the PR that fixed issue #2929: #2939