m-j-w/CpuId.jl

Number of cores incorrectly reported?

Closed this issue · 11 comments

I think physical and logical cores are being reported incorrectly?

Julia-0.6.4> Sys.CPU_CORES
48

Julia-0.6.4> Hwloc.num_physical_cores()
24

Julia-0.6.4> CpuId.cpucores()
12

Julia-0.6.4> CpuId.cpuinfo()
Cpu Property       Value
–––––––––––––––––– ––––––––––––––––––––––––––––––––––––––––––––––––––––––––––
Brand              Intel(R) Xeon(R) CPU E5-2687W v4 @ 3.00GHz
Vendor             :Intel
Architecture       :Broadwell
Model              Family: 6, Model: 79, Stepping: 1, Type: 0
Cores              12 physical cores, 24 logical cores (on executing CPU)
                   Hyperthreading detected
Clock Frequencies  Not supported by CPU
Data Cache         Level 1:3 : (32, 256, 30720) kbytes
                   64 byte cache line size
Address Size       48 bits virtual, 46 bits physical
SIMD               256 bit = 32 byte max. SIMD vector size
Time Stamp Counter TSC is accessible via `rdtsc`
                   TSC runs at constant rate (invariant from clock frequency)
Perf. Monitoring   Performance Monitoring Counters (PMC) revision 3
                   Available hardware counters per logical core:
                   3 fixed-function counters of 48 bit width
                   4 general-purpose counters of 48 bit width
Hypervisor         No

Oh, I see. The output says "12 physical cores, 24 logical cores (on executing CPU)", and this PC has 2 processors.

m-j-w commented

@GregPlowman Yes, your analysis is correct. You only ever get information from, and for, the CPU that executes the query.

OK this makes sense. Thanks.

After a more careful look at the readme, I see a caveat about this:

Moreover, the cpuid instruction can only provide information for the executing physical CPU, called a package. To obtain information on all packages, and all physical and logical cores, the executing program must be pinned sequentially to each and every core, and gather its properties. This is how libuv, hwloc or the operating system obtain that kind of information. However, this would require additional external or operating system dependent code which is not within the scope of this package.

But further down, it states:

In particular CPU_CORES is the reason for this module: It's intrinsically unclear whether that number includes hyperthreading cores, or whether it is referring to real physical cores of the current machine.

Perhaps add a warning here that CpuId.cpucores() can also be misinterpreted on multi-processor computers. My own use case was determining the number of physical cores across all processors, to be used with addprocs().

m-j-w commented

Good point.

I've also encountered a situation in which the "number of physical cores" reported by CpuId.jl is not the same as that reported by Hwloc.jl: JuliaParallel/Hwloc.jl#40

I'm not sure which value is correct.

I think the machine that I am referring to in JuliaParallel/Hwloc.jl#40 is a two-socket machine. So I think I'm running into the same issue as @GregPlowman, where cpucores() is only returning info for one of the sockets.

Actually, I think there is still a bug here.

Here is the output I get from lscpu:

daluthge@node1111.oscar.ccv.brown.edu:/users/daluthge/Desktop$ lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                24
On-line CPU(s) list:   0-23
Thread(s) per core:    1
Core(s) per socket:    12
Socket(s):             2
NUMA node(s):          2
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 85
Model name:            Intel(R) Xeon(R) Gold 6126 CPU @ 2.60GHz
Stepping:              4
CPU MHz:               3299.829
CPU max MHz:           3700.0000
CPU min MHz:           1000.0000
BogoMIPS:              5200.00
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              1024K
L3 cache:              19712K
NUMA node0 CPU(s):     0-11
NUMA node1 CPU(s):     12-23

And here's the output of lscpu -e -a:

CPU NODE SOCKET CORE L1d:L1i:L2:L3 ONLINE MAXMHZ    MINMHZ
0   0    0      0    0:0:0:0       yes    3700.0000 1000.0000
1   0    0      1    1:1:1:0       yes    3700.0000 1000.0000
2   0    0      2    2:2:2:0       yes    3700.0000 1000.0000
3   0    0      3    3:3:3:0       yes    3700.0000 1000.0000
4   0    0      4    4:4:4:0       yes    3700.0000 1000.0000
5   0    0      5    5:5:5:0       yes    3700.0000 1000.0000
6   0    0      6    6:6:6:0       yes    3700.0000 1000.0000
7   0    0      7    7:7:7:0       yes    3700.0000 1000.0000
8   0    0      8    8:8:8:0       yes    3700.0000 1000.0000
9   0    0      9    9:9:9:0       yes    3700.0000 1000.0000
10  0    0      10   10:10:10:0    yes    3700.0000 1000.0000
11  0    0      11   11:11:11:0    yes    3700.0000 1000.0000
12  1    1      12   12:12:12:1    yes    3700.0000 1000.0000
13  1    1      13   13:13:13:1    yes    3700.0000 1000.0000
14  1    1      14   14:14:14:1    yes    3700.0000 1000.0000
15  1    1      15   15:15:15:1    yes    3700.0000 1000.0000
16  1    1      16   16:16:16:1    yes    3700.0000 1000.0000
17  1    1      17   17:17:17:1    yes    3700.0000 1000.0000
18  1    1      18   18:18:18:1    yes    3700.0000 1000.0000
19  1    1      19   19:19:19:1    yes    3700.0000 1000.0000
20  1    1      20   20:20:20:1    yes    3700.0000 1000.0000
21  1    1      21   21:21:21:1    yes    3700.0000 1000.0000
22  1    1      22   22:22:22:1    yes    3700.0000 1000.0000
23  1    1      23   23:23:23:1    yes    3700.0000 1000.0000

And this is the output of cat /sys/devices/system/cpu/smt/active:

$ cat /sys/devices/system/cpu/smt/active
0

As I understand it, this means that I have two physical sockets, and each socket has 12 physical cores, and each physical core has one thread (i.e. one logical core).
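The arithmetic behind that reading can be sketched in a few lines of Julia (1.x assumed). This is only an illustration: `parse_lscpu` and `core_summary` are hypothetical helper names, not part of CpuId.jl or Hwloc.jl, and the sample text is just the three relevant lines from the lscpu output above.

```julia
# Hypothetical helper: turn `lscpu`-style "Key: value" lines into a Dict.
function parse_lscpu(text::AbstractString)
    fields = Dict{String,String}()
    for line in split(text, '\n')
        parts = split(line, ':'; limit=2)
        length(parts) == 2 || continue          # skip blank or malformed lines
        fields[String(strip(parts[1]))] = String(strip(parts[2]))
    end
    return fields
end

# Hypothetical helper: machine-wide totals from the per-socket figures.
function core_summary(fields::Dict{String,String})
    sockets          = parse(Int, fields["Socket(s)"])
    cores_per_socket = parse(Int, fields["Core(s) per socket"])
    threads_per_core = parse(Int, fields["Thread(s) per core"])
    physical = sockets * cores_per_socket       # physical cores on all sockets
    logical  = physical * threads_per_core      # logical cores on all sockets
    return (sockets=sockets, physical=physical, logical=logical)
end

# The three relevant lines from the lscpu output quoted above.
sample = """
Thread(s) per core:    1
Core(s) per socket:    12
Socket(s):             2
"""

counts = core_summary(parse_lscpu(sample))
println(counts)
```

For this machine the sketch gives 24 physical and 24 logical cores in total, which matches `CPU(s): 24` in the lscpu output.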

So, if I am correct, when I run CpuId.cpuinfo(), it runs on a single physical socket, so it should report that I have 12 physical cores and 12 logical cores.

However, this is not what happens. Instead, CpuId.cpuinfo() gives me:

julia> CpuId.cpuinfo()
  Cpu Property       Value
  –––––––––––––––––– ––––––––––––––––––––––––––––––––––––––––––––––––––––––––––
  Brand              Intel(R) Xeon(R) Gold 6126 CPU @ 2.60GHz
  Vendor             :Intel
  Architecture       :Skylake
  Model              Family: 0x06, Model: 0x55, Stepping: 0x04, Type: 0x00
  Cores              12 physical cores, 24 logical cores (on executing CPU)
                     Hyperthreading detected
  Clock Frequencies  2600 / 3700 MHz (base/max), 100 MHz bus
  Data Cache         Level 1:3 : (32, 1024, 19712) kbytes
                     64 byte cache line size
  Address Size       48 bits virtual, 46 bits physical
  SIMD               512 bit = 64 byte max. SIMD vector size
  Time Stamp Counter TSC is accessible via `rdtsc`
                     TSC runs at constant rate (invariant from clock frequency)
  Perf. Monitoring   Performance Monitoring Counters (PMC) revision 4
                     Available hardware counters per logical core:
                     3 fixed-function counters of 48 bit width
                     8 general-purpose counters of 48 bit width
  Hypervisor         No

Shouldn't CpuId.cpuinfo() return 12 physical cores, 12 logical cores (on executing CPU) in this case? (Instead of 12 physical cores, 24 logical cores (on executing CPU)?)

Also, CpuId.cpuinfo() says Hyperthreading detected, but since the output of cat /sys/devices/system/cpu/smt/active was 0, doesn't that mean that hyperthreading is disabled?
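For anyone who needs the operating system's view rather than the CPU's self-report, here is a minimal sketch that reads the same sysfs file from Julia (1.x assumed). `smt_active` is a hypothetical helper name, not a CpuId.jl function, and the file only exists on Linux kernels that expose it, so the function returns `nothing` elsewhere.

```julia
# Hypothetical helper: report the kernel's SMT state.
# Returns true or false when the sysfs file exists, `nothing` otherwise.
function smt_active(path::AbstractString = "/sys/devices/system/cpu/smt/active")
    isfile(path) || return nothing
    # The file contains "1" when SMT is active and "0" when it is not.
    return strip(read(path, String)) == "1"
end

state = smt_active()
println(state === nothing ? "SMT state not exposed on this system" :
        state ? "SMT is active" : "SMT is not active")
```

On the cluster node above, where the file contains 0, this would report that SMT is not active, independently of what the cpuid instruction reports.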

m-j-w commented

I quickly googled the specs of the Xeon Gold 6126 processor you're querying. The spec sheet states 12 hardware cores and 24 logical cores. See e.g. here: https://ark.intel.com/content/www/de/de/ark/products/120483/intel-xeon-gold-6126-processor-19-25m-cache-2-60-ghz.html So I guess the output of cpuid is correct?

Yeah I saw that as well. However, I think that is only if hyperthreading is enabled, right?

But if cat /sys/devices/system/cpu/smt/active returns 0, that means that the cluster admins have disabled hyperthreading on that machine, right? So if hyperthreading is disabled, shouldn't we only have 12 logical cores?

I get the same issue on my local machine when I disable hyperthreading: #43

m-j-w commented

@DilumAluthge The text you're referring to is simply created by comparing logical cores to physical cores. If there is a factor of two between them, then it writes "Hyperthreading detected".

I believe you have expectations that this package cannot meet. It uses the hardware CPU instruction 'cpuid' to ask the processor for its self-reported capabilities. This only ever works for the processor the task runs on, and it can only return the answer the processor reports. If the admin disables certain cores in the BIOS or OS, then the CPU does not report this. Thus, this is outside the scope of this tool.
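The comparison described above can be sketched roughly as follows (Julia 1.x assumed). This is an illustration of the described rule only, not the actual CpuId.jl source: `threading_label` is a hypothetical name, and the fallback string is an assumption made for the example.

```julia
# Hypothetical helper mirroring the described rule; not CpuId.jl's real code.
function threading_label(physical::Integer, logical::Integer)
    physical > 0 || throw(ArgumentError("physical core count must be positive"))
    # A factor of two between logical and physical cores is read as hyperthreading.
    # The fallback text is an assumption for illustration.
    return logical == 2 * physical ? "Hyperthreading detected" :
                                     "No hyperthreading detected"
end

# Counts the CPU reports for itself in the cpuinfo() output above.
println(threading_label(12, 24))
```

Because the inputs are the counts the processor reports for itself, a rule like this still says "Hyperthreading detected" when hyperthreading has been switched off in the BIOS or OS, which is the behaviour discussed in this thread.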

m-j-w commented

Added the warning for usage on multi-processor hardware, and merged #45