Number of cores incorrectly reported?
I think physical and logical cores are being reported incorrectly?
Julia-0.6.4> Sys.CPU_CORES
48
Julia-0.6.4> Hwloc.num_physical_cores()
24
Julia-0.6.4> CpuId.cpucores()
12
Julia-0.6.4> CpuId.cpuinfo()
Cpu Property Value
–––––––––––––––––– ––––––––––––––––––––––––––––––––––––––––––––––––––––––––––
Brand Intel(R) Xeon(R) CPU E5-2687W v4 @ 3.00GHz
Vendor :Intel
Architecture :Broadwell
Model Family: 6, Model: 79, Stepping: 1, Type: 0
Cores 12 physical cores, 24 logical cores (on executing CPU)
Hyperthreading detected
Clock Frequencies Not supported by CPU
Data Cache Level 1:3 : (32, 256, 30720) kbytes
64 byte cache line size
Address Size 48 bits virtual, 46 bits physical
SIMD 256 bit = 32 byte max. SIMD vector size
Time Stamp Counter TSC is accessible via `rdtsc`
TSC runs at constant rate (invariant from clock frequency)
Perf. Monitoring Performance Monitoring Counters (PMC) revision 3
Available hardware counters per logical core:
3 fixed-function counters of 48 bit width
4 general-purpose counters of 48 bit width
Hypervisor No
Oh, I see. The output says "12 physical cores, 24 logical cores (on executing CPU)", and this PC has 2 processors.
@GregPlowman Yes, your analysis is correct. You only ever get information from, and for, the CPU that executes the query.
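The three numbers at the top of the thread can be reconciled with a little arithmetic. The following is a minimal sketch, assuming Julia 1.x and assuming both processors are identical; `machine_totals` is a hypothetical helper for illustration, not part of CpuId.jl or Hwloc.jl.

```julia
# Hypothetical helper: scale the per-package counts that cpuid reports for the
# executing CPU up to whole-machine totals. The number of packages has to come
# from somewhere else (for example the OS), because cpuid cannot see other sockets.
# Assumption: every package has the same core and thread counts.
function machine_totals(cores_per_package::Integer,
                        threads_per_package::Integer,
                        packages::Integer)
    return (physical = cores_per_package * packages,
            logical  = threads_per_package * packages)
end

# Figures from the cpuinfo() output above, on a machine with 2 processors.
totals = machine_totals(12, 24, 2)
println(totals)
```

With these inputs, `totals.physical` agrees with the 24 from `Hwloc.num_physical_cores()` and `totals.logical` agrees with the 48 from `Sys.CPU_CORES`.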
OK this makes sense. Thanks.
After a more careful look at the readme, I see a caveat about this:
Moreover, the cpuid instruction can only provide information for the executing physical CPU, called a package. To obtain information on all packages, and all physical and logical cores, the executing program must be pinned sequentially to each and every core, and gather its properties. This is how libuv, hwloc or the operating system obtain that kind of information. However, this would require additional external or operating system dependent code which is not in the scope of this package.
But further down, it states:
In particular CPU_CORES is the reason for this module: It's intrinsically unclear whether that number includes hyperthreading cores, or whether it is referring to real physical cores of the current machine.
Perhaps add a warning here that CpuId.cpucores() can also be misleading on multi-processor computers. My own use case was determining the number of physical cores across all processors, to be used with addprocs().
Good point.
I've also encountered a situation in which the "number of physical cores" reported by CpuId.jl is not the same as that reported by Hwloc.jl: JuliaParallel/Hwloc.jl#40
I'm not sure which value is correct.
I think the machine that I am referring to in JuliaParallel/Hwloc.jl#40 is a two-socket machine. So I think I'm running into the same issue as @GregPlowman, where cpucores() is only returning info for one of the sockets.
Actually, I think there is still a bug here.
Here is the output I get from lscpu:
daluthge@node1111.oscar.ccv.brown.edu:/users/daluthge/Desktop$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 24
On-line CPU(s) list: 0-23
Thread(s) per core: 1
Core(s) per socket: 12
Socket(s): 2
NUMA node(s): 2
Vendor ID: GenuineIntel
CPU family: 6
Model: 85
Model name: Intel(R) Xeon(R) Gold 6126 CPU @ 2.60GHz
Stepping: 4
CPU MHz: 3299.829
CPU max MHz: 3700.0000
CPU min MHz: 1000.0000
BogoMIPS: 5200.00
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 1024K
L3 cache: 19712K
NUMA node0 CPU(s): 0-11
NUMA node1 CPU(s): 12-23
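The machine-wide totals can be derived from the three topology fields in this listing. Below is a minimal sketch, assuming Julia 1.x; `parse_lscpu` is a hypothetical helper written for this illustration, and the sample text is a hand-copied excerpt of the output above rather than a live call to `lscpu`.

```julia
# Hypothetical helper: parse `lscpu`-style "Key: value" lines into a Dict.
function parse_lscpu(text::AbstractString)
    fields = Dict{String,String}()
    for line in split(text, '\n')
        parts = split(line, ':'; limit = 2)
        length(parts) == 2 || continue      # skip blank or malformed lines
        fields[String(strip(parts[1]))] = String(strip(parts[2]))
    end
    return fields
end

# Excerpt of the lscpu output shown above.
sample = """
CPU(s):                24
Thread(s) per core:    1
Core(s) per socket:    12
Socket(s):             2
"""

info    = parse_lscpu(sample)
sockets = parse(Int, info["Socket(s)"])
cores   = parse(Int, info["Core(s) per socket"])
threads = parse(Int, info["Thread(s) per core"])

physical_total = sockets * cores            # physical cores across all sockets
logical_total  = physical_total * threads   # logical CPUs visible to the OS
println((physical_total, logical_total))
```

Both totals come out equal here because the OS reports one thread per core, which is consistent with the `CPU(s)` field.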
And here's the output of lscpu -e -a:
CPU NODE SOCKET CORE L1d:L1i:L2:L3 ONLINE MAXMHZ MINMHZ
0 0 0 0 0:0:0:0 yes 3700.0000 1000.0000
1 0 0 1 1:1:1:0 yes 3700.0000 1000.0000
2 0 0 2 2:2:2:0 yes 3700.0000 1000.0000
3 0 0 3 3:3:3:0 yes 3700.0000 1000.0000
4 0 0 4 4:4:4:0 yes 3700.0000 1000.0000
5 0 0 5 5:5:5:0 yes 3700.0000 1000.0000
6 0 0 6 6:6:6:0 yes 3700.0000 1000.0000
7 0 0 7 7:7:7:0 yes 3700.0000 1000.0000
8 0 0 8 8:8:8:0 yes 3700.0000 1000.0000
9 0 0 9 9:9:9:0 yes 3700.0000 1000.0000
10 0 0 10 10:10:10:0 yes 3700.0000 1000.0000
11 0 0 11 11:11:11:0 yes 3700.0000 1000.0000
12 1 1 12 12:12:12:1 yes 3700.0000 1000.0000
13 1 1 13 13:13:13:1 yes 3700.0000 1000.0000
14 1 1 14 14:14:14:1 yes 3700.0000 1000.0000
15 1 1 15 15:15:15:1 yes 3700.0000 1000.0000
16 1 1 16 16:16:16:1 yes 3700.0000 1000.0000
17 1 1 17 17:17:17:1 yes 3700.0000 1000.0000
18 1 1 18 18:18:18:1 yes 3700.0000 1000.0000
19 1 1 19 19:19:19:1 yes 3700.0000 1000.0000
20 1 1 20 20:20:20:1 yes 3700.0000 1000.0000
21 1 1 21 21:21:21:1 yes 3700.0000 1000.0000
22 1 1 22 22:22:22:1 yes 3700.0000 1000.0000
23 1 1 23 23:23:23:1 yes 3700.0000 1000.0000
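A table like this one can be reduced to the two counts under discussion: the number of rows gives the logical CPUs, and the number of distinct (SOCKET, CORE) pairs gives the physical cores. The sketch below assumes Julia 1.x and a header that names the columns `SOCKET` and `CORE`; `count_cores` is a hypothetical helper, and the second sample is an invented two-row table that only illustrates what a hyperthreaded core would look like.

```julia
# Hypothetical helper: count logical CPUs and distinct physical cores from
# `lscpu -e`-style text. Assumes whitespace-separated columns and a header row
# containing the names SOCKET and CORE.
function count_cores(table::AbstractString)
    lines = filter(!isempty, map(strip, split(table, '\n')))
    header = split(lines[1])
    socket_col = findfirst(==("SOCKET"), header)
    core_col   = findfirst(==("CORE"), header)
    (socket_col === nothing || core_col === nothing) &&
        error("expected SOCKET and CORE columns in the header")
    cores = Set{Tuple{String,String}}()
    for row in lines[2:end]
        cols = split(row)
        push!(cores, (String(cols[socket_col]), String(cols[core_col])))
    end
    return (logical = length(lines) - 1, physical = length(cores))
end

# Four rows copied from the listing above (CPUs 0, 1, 12 and 13).
listing_excerpt = """
CPU NODE SOCKET CORE L1d:L1i:L2:L3 ONLINE MAXMHZ    MINMHZ
0   0    0      0    0:0:0:0       yes    3700.0000 1000.0000
1   0    0      1    1:1:1:0       yes    3700.0000 1000.0000
12  1    1      12   12:12:12:1    yes    3700.0000 1000.0000
13  1    1      13   13:13:13:1    yes    3700.0000 1000.0000
"""

# Invented example: two logical CPUs sharing one core, as SMT would show.
smt_example = """
CPU NODE SOCKET CORE
0   0    0      0
1   0    0      0
"""

println(count_cores(listing_excerpt))
println(count_cores(smt_example))
```

In the listing above every row has its own CORE value, so the logical and physical counts coincide.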
And this is the output of cat /sys/devices/system/cpu/smt/active:
$ cat /sys/devices/system/cpu/smt/active
0
As I understand it, this means that I have two physical sockets, and each socket has 12 physical cores, and each physical core has one thread (i.e. one logical core).
So, if I am correct, when I run CpuId.cpuinfo(), it runs on a single physical socket, so it should report that I have 12 physical cores and 12 logical cores.
However, this is not what happens. Instead, CpuId.cpuinfo() gives me:
julia> CpuId.cpuinfo()
Cpu Property Value
–––––––––––––––––– ––––––––––––––––––––––––––––––––––––––––––––––––––––––––––
Brand Intel(R) Xeon(R) Gold 6126 CPU @ 2.60GHz
Vendor :Intel
Architecture :Skylake
Model Family: 0x06, Model: 0x55, Stepping: 0x04, Type: 0x00
Cores 12 physical cores, 24 logical cores (on executing CPU)
Hyperthreading detected
Clock Frequencies 2600 / 3700 MHz (base/max), 100 MHz bus
Data Cache Level 1:3 : (32, 1024, 19712) kbytes
64 byte cache line size
Address Size 48 bits virtual, 46 bits physical
SIMD 512 bit = 64 byte max. SIMD vector size
Time Stamp Counter TSC is accessible via `rdtsc`
TSC runs at constant rate (invariant from clock frequency)
Perf. Monitoring Performance Monitoring Counters (PMC) revision 4
Available hardware counters per logical core:
3 fixed-function counters of 48 bit width
8 general-purpose counters of 48 bit width
Hypervisor No
Shouldn't CpuId.cpuinfo() return "12 physical cores, 12 logical cores (on executing CPU)" in this case, instead of "12 physical cores, 24 logical cores (on executing CPU)"?
Also, CpuId.cpuinfo() says "Hyperthreading detected", but since the output of cat /sys/devices/system/cpu/smt/active was 0, doesn't that mean that hyperthreading is disabled?
I quickly googled for the specs of the processor you're querying, the Xeon Gold 6126. The spec sheet states 12 hardware cores and 24 logical cores. See e.g. here: https://ark.intel.com/content/www/de/de/ark/products/120483/intel-xeon-gold-6126-processor-19-25m-cache-2-60-ghz.html So I guess the output of cpuid is correct?
Yeah I saw that as well. However, I think that is only if hyperthreading is enabled, right?
But if cat /sys/devices/system/cpu/smt/active returns 0, that means that the cluster admins have disabled hyperthreading on that machine, right? So if hyperthreading is disabled, shouldn't we only have 12 logical cores?
I get the same issue on my local machine when I disable hyperthreading: #43
@DilumAluthge The text you're referring to is simply created by comparing logical cores to physical cores. If there is a factor of two between them, then it writes "Hyperthreading detected".
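To make that comparison concrete, here is a minimal sketch of such a ratio-based check, assuming Julia 1.x. It is a hypothetical re-creation for illustration and not CpuId.jl's actual source; only the "Hyperthreading detected" string comes from the output above, while the function name and the other two labels are invented.

```julia
# Hypothetical re-creation of the heuristic described above. It looks only at
# the counts the CPU reports about itself, so it cannot notice that SMT has
# been switched off in the BIOS or by the operating system.
function hyperthreading_label(physical::Integer, logical::Integer)
    if logical == 2 * physical
        return "Hyperthreading detected"
    elseif logical == physical
        return "No hyperthreading detected"      # invented label
    else
        return "Hyperthreading status unclear"   # invented label
    end
end

# The Xeon Gold 6126 reports 12 physical and 24 logical cores per package.
println(hyperthreading_label(12, 24))
```

Because the inputs are the processor's self-reported capabilities, the label stays the same on the machine discussed here even though the kernel reports SMT as inactive.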
I believe you have other expectations than this package can meet. It uses the hardware CPU instruction 'cpuid' to ask the processor for its self-reported capabilities. This only ever works for the processor the task runs on, and can only provide the answer that the processor reports. If the admin disables certain cores in the BIOS or OS, the CPU does not report this. Thus, this is outside the scope of this tool.
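Since the OS-level state is outside what cpuid can report, a program that needs it has to ask the operating system. On Linux, the sysfs file quoted earlier in the thread can be read directly. This is a minimal sketch, assuming Julia 1.x and a kernel that exposes this file; `smt_active` is a hypothetical helper and is not part of CpuId.jl.

```julia
# Hypothetical helper: read the kernel's SMT state on Linux.
# Returns true or false, or `nothing` when the file does not exist
# (non-Linux systems, or kernels that do not provide this interface).
function smt_active(path::AbstractString = "/sys/devices/system/cpu/smt/active")
    isfile(path) || return nothing
    return strip(read(path, String)) == "1"
end

println(smt_active())
```

On the cluster node discussed above, where the file contains `0`, this would return `false`, which could then be combined with the physical core count from cpuid to get the number of usable logical cores per package.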