collect_stats tries to access sysfs
dcermak opened this issue · 5 comments
The collect_stats.py script tries to query the current CPU frequency and governor from sysfs:
phoebe/scripts/collect_stats.py
Line 187 in 7e269fc
Unfortunately, this fails in GitHub Actions with:
Traceback (most recent call last):
  File "/__w/phoebe/phoebe/scripts/collect_stats.py", line 306, in <module>
    main(sys.argv[1], settings, count)
  File "/__w/phoebe/phoebe/scripts/collect_stats.py", line 271, in main
    collect_stats(
  File "/__w/phoebe/phoebe/scripts/collect_stats.py", line 187, in collect_stats
    with open(SYSFS_CPU_PATH + 'cpu0/cpufreq/scaling_governor') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor'
I suspect this is because the GitHub Actions runner does not let the CI job query this information, to prevent it from modifying the CPU's behavior.
@shunghsiyu I was told you introduced this; can we remove it in the meantime?
Yes, but I suspect removing just scaling_governor may not be enough.
I think the main problem is that different kernels seem to expose different sysfs files and directories inside a container. (Correct me if I'm wrong here.)
Previously I used a workaround where I maintained a set of sysfs entries, SYSCTL_NOT_IN_CONTAINER, that I knew were not present in our previous CI runner's environment (on GitLab). Entries in SYSCTL_NOT_IN_CONTAINER were added (painstakingly) through trial and error until the script finally ran.
That wasn't a great workaround anyway.
I think a better way forward is perhaps to detect that we're inside a container and, if so, be more relaxed about missing sysfs entries, using a value of 0 instead (or some other value, TBD). @mvarlese what do you think?
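To make the proposal concrete, here is a rough sketch of what "detect the container and tolerate missing entries" could look like. This is not the code in collect_stats.py: the helper names in_container and read_sysfs are hypothetical, the marker files (/.dockerenv, /run/.containerenv) and the `container` environment variable are common conventions rather than a reliable test, and the "0" fallback is just the placeholder value suggested above.

```python
import os
from pathlib import Path

# Same base path that appears in the traceback above.
SYSFS_CPU_PATH = "/sys/devices/system/cpu/"


def in_container(root="/"):
    """Best-effort guess at whether we run inside a container (hypothetical helper).

    Checks marker files that Docker and Podman usually create, plus the
    `container` environment variable. This is a heuristic and can miss
    other runtimes.
    """
    markers = (Path(root, ".dockerenv"), Path(root, "run", ".containerenv"))
    return any(m.exists() for m in markers) or bool(os.environ.get("container"))


def read_sysfs(path, default="0", tolerate_missing=False):
    """Read one sysfs entry and return its stripped content (hypothetical helper).

    If the file is missing and tolerate_missing is true, return `default`
    instead of raising FileNotFoundError.
    """
    try:
        with open(path) as f:
            return f.read().strip()
    except FileNotFoundError:
        if tolerate_missing:
            return default
        raise


if __name__ == "__main__":
    relaxed = in_container()
    governor = read_sysfs(
        SYSFS_CPU_PATH + "cpu0/cpufreq/scaling_governor",
        tolerate_missing=relaxed,
    )
    print(governor)
```

Outside a container a missing entry still raises, so real breakage on a normal host is not silently hidden.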
Frankly speaking, I don't see Phoebe being deployed in a container, so I'm not sure that running the .py script within a container and considering its results (whether pass or fail) pays off.
Okay, then let's remove it in the meantime; I'll open a PR.
Can't this issue be closed now?
I think so, the script now runs on the CI.