PCM shows wrong number of UPI ports
montagetao opened this issue · 8 comments
I am using the branch tag 202405 and noticed that the UPI port numbers are incorrect. However, when using the branch tag 202211, the UPI port numbers display correctly.
I found that the UPI port numbers are derived from the function size_t getNumQPIPorts() const { return xpiPMUs.size(); }. However, since xpiPMUs.size() is encapsulated, I would like to inquire if it is possible to add support for displaying UPI ports for the Montage 6426Y.
Test config: Archer City, 2 sockets, Montage 6426Y
OS: CentOS 9
PCM: branch tag 202405
Thank you in advance!
branch tag 202405 shows two UPI Ports
branch tag 202211 shows Three UPI Ports
could you please try to run pcm with this environment variable set: PCM_NO_UPILL_DISCOVERY=1
e.g.
export PCM_NO_UPILL_DISCOVERY=1
pcm
do you then see two or three UPI ports?
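A minimal sketch of the same check without `export`, so that the variable applies to a single run and does not persist in the shell session. The helper name `run_without_upi_discovery` is hypothetical; only the variable `PCM_NO_UPILL_DISCOVERY` comes from this thread.

```shell
#!/bin/sh
# Hypothetical helper: run a command with PCM_NO_UPILL_DISCOVERY=1 set for
# that command only, leaving the calling shell's environment unchanged.
run_without_upi_discovery() {
    env PCM_NO_UPILL_DISCOVERY=1 "$@"
}

# Usage (assumes pcm is on PATH and is run with sufficient privileges):
#   run_without_upi_discovery pcm
```

Scoping the variable this way makes it easier to compare runs with and without the setting in the same terminal.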
There are three UPI ports after export PCM_NO_UPILL_DISCOVERY=1. Could you please share more information about this? Is there a plan to fix it? Thank you, rdementi.
This is likely a BIOS issue with exposing PMU for UPI. Could you please collect this BIOS uncore PMU discovery dump:
export PCM_PRINT_UNCORE_PMU_DISCOVERY=1;
./pcm-memory -i=1 > uncore_pmu_dump.txt 2>&1
please attach/share uncore_pmu_dump.txt
uncore_pmu_dump.txt
Please refer to the attachment. One more question: why can branch tag 202211 identify three UPI ports with this BIOS?
Thanks for the dump. The BIOS UPI table is corrupt, this is the reason. Could you please share the BIOS version (dmidecode tool shows it in the "BIOS Information" section)?
In the 202211 version, pcm did not rely on the BIOS discovery tables for UPI.
Could you please also try to set this env variable: PCM_USE_UNCORE_PERF=1 and see how many UPI links are detected?
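For the BIOS version request above, a small sketch that pulls the "Version:" field out of the "BIOS Information" section of dmidecode output. The function name `bios_version_from_dmidecode` is hypothetical, and the sketch assumes the usual dmidecode layout of an unindented section title followed by indented fields.

```shell
#!/bin/sh
# Print the "Version:" field of the "BIOS Information" section from
# dmidecode output supplied on stdin (hypothetical helper).
bios_version_from_dmidecode() {
    awk '
        /^BIOS Information/ { in_bios = 1; next }   # section title starts the block
        /^[^[:space:]]/     { in_bios = 0 }         # any other unindented line ends it
        in_bios && /^[[:space:]]*Version:/ {
            sub(/^[[:space:]]*Version:[[:space:]]*/, "")
            print
            exit
        }
    '
}

# Usage (dmidecode normally needs root):
#   sudo dmidecode -t bios | bios_version_from_dmidecode
```

Running `sudo dmidecode -s bios-version` should print the same string directly; the function is mainly useful when only a saved dmidecode dump is available.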
BIOS Version: EGSDCRB1.SYS.0107.D52.2311070228
With export PCM_USE_UNCORE_PERF=1, there are three UPI links:
Socket 0
Max UPI link 0 speed: 35.8 GBytes/second (16.0 GT/second)
Max UPI link 1 speed: 35.8 GBytes/second (16.0 GT/second)
Max UPI link 2 speed: 35.8 GBytes/second (16.0 GT/second)
Socket 1
Max UPI link 0 speed: 35.8 GBytes/second (16.0 GT/second)
Max UPI link 1 speed: 35.8 GBytes/second (16.0 GT/second)
Max UPI link 2 speed: 35.8 GBytes/second (16.0 GT/second)
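To compare the link count across the different settings without reading the output by eye, a rough sketch that counts the "Max UPI link" lines per socket. It assumes the output format shown above ("Socket N" on its own line, followed by one "Max UPI link" line per link); the function name `count_upi_links` is hypothetical.

```shell
#!/bin/sh
# Count "Max UPI link N speed" lines under each "Socket N" header in pcm
# output read from stdin (hypothetical helper, format assumed from above).
count_upi_links() {
    awk '
        /^[[:space:]]*Socket [0-9]+[[:space:]]*$/ {
            socket = $2; seen = 1; order[++n] = socket; next
        }
        seen && /Max UPI link [0-9]+ speed/ { links[socket]++ }
        END {
            for (i = 1; i <= n; i++)
                printf "socket %s: %d UPI links\n", order[i], links[order[i]]
        }
    '
}

# Usage (assumes pcm output was saved to a file beforehand):
#   count_upi_links < pcm_output.txt
```

Saving the output from each run (default, PCM_NO_UPILL_DISCOVERY=1, PCM_USE_UNCORE_PERF=1) and passing each file through the function gives a quick side-by-side of the detected link counts.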
Can you elaborate on the BIOS UPI table being corrupt? What interface does PCM use to communicate with the BIOS regarding this UPI table? Can the BIOS version be used to determine if the BIOS UPI table is corrupt?
thanks!
This mechanism is described here: https://cdrdv2.intel.com/v1/dl/getContent/642245
Linux perf driver relies on it: https://lwn.net/Articles/860811/
I realize the issue you have reported has already been discovered and worked around in perf:
https://lore.kernel.org/lkml/20221129191023.936738-1-kan.liang@linux.intel.com/T/
https://www.suse.com/support/kb/doc/?id=000021138
I will push a workaround to pcm too.