open-mpi/hwloc

Non contiguous physical numbering of cores

antoine-morvan opened this issue · 2 comments

What version of hwloc are you using?

  • hwloc-bind 2.10.0
  • custom build (default flags) on RHEL8.8

Which operating system and hardware are you running on?

  • CPU: Intel(R) Xeon(R) CPU E5-2650 v3 @ 2.30GHz
  • RHEL 8.8
$ lstopo -
Machine (220GB total)
  Package L#0
    NUMANode L#0 (P#0 125GB)
    L3 L#0 (25MB)
      L2 L#0 (256KB) + L1d L#0 (32KB) + L1i L#0 (32KB) + Core L#0
        PU L#0 (P#0)
        PU L#1 (P#20)
      L2 L#1 (256KB) + L1d L#1 (32KB) + L1i L#1 (32KB) + Core L#1
        PU L#2 (P#1)
        PU L#3 (P#21)
      L2 L#2 (256KB) + L1d L#2 (32KB) + L1i L#2 (32KB) + Core L#2
        PU L#4 (P#2)
        PU L#5 (P#22)
      L2 L#3 (256KB) + L1d L#3 (32KB) + L1i L#3 (32KB) + Core L#3
        PU L#6 (P#3)
        PU L#7 (P#23)
      L2 L#4 (256KB) + L1d L#4 (32KB) + L1i L#4 (32KB) + Core L#4
        PU L#8 (P#4)
        PU L#9 (P#24)
      L2 L#5 (256KB) + L1d L#5 (32KB) + L1i L#5 (32KB) + Core L#5
        PU L#10 (P#5)
        PU L#11 (P#25)
      L2 L#6 (256KB) + L1d L#6 (32KB) + L1i L#6 (32KB) + Core L#6
        PU L#12 (P#6)
        PU L#13 (P#26)
      L2 L#7 (256KB) + L1d L#7 (32KB) + L1i L#7 (32KB) + Core L#7
        PU L#14 (P#7)
        PU L#15 (P#27)
      L2 L#8 (256KB) + L1d L#8 (32KB) + L1i L#8 (32KB) + Core L#8
        PU L#16 (P#8)
        PU L#17 (P#28)
      L2 L#9 (256KB) + L1d L#9 (32KB) + L1i L#9 (32KB) + Core L#9
        PU L#18 (P#9)
        PU L#19 (P#29)
    HostBridge
      PCIBridge
        PCI 03:00.0 (Ethernet)
          Net "enp3s0f0"
        PCI 03:00.1 (Ethernet)
          Net "enp3s0f1"
      PCI 00:11.4 (SATA)
        Block(Disk) "sda"
      PCIBridge
        PCIBridge
          PCI 06:00.0 (VGA)
      PCI 00:1f.2 (SATA)
        Block(Disk) "sdb"
  Package L#1
    NUMANode L#1 (P#1 94GB)
    L3 L#1 (25MB)
      L2 L#10 (256KB) + L1d L#10 (32KB) + L1i L#10 (32KB) + Core L#10
        PU L#20 (P#10)
        PU L#21 (P#30)
      L2 L#11 (256KB) + L1d L#11 (32KB) + L1i L#11 (32KB) + Core L#11
        PU L#22 (P#11)
        PU L#23 (P#31)
      L2 L#12 (256KB) + L1d L#12 (32KB) + L1i L#12 (32KB) + Core L#12
        PU L#24 (P#12)
        PU L#25 (P#32)
      L2 L#13 (256KB) + L1d L#13 (32KB) + L1i L#13 (32KB) + Core L#13
        PU L#26 (P#13)
        PU L#27 (P#33)
      L2 L#14 (256KB) + L1d L#14 (32KB) + L1i L#14 (32KB) + Core L#14
        PU L#28 (P#14)
        PU L#29 (P#34)
      L2 L#15 (256KB) + L1d L#15 (32KB) + L1i L#15 (32KB) + Core L#15
        PU L#30 (P#15)
        PU L#31 (P#35)
      L2 L#16 (256KB) + L1d L#16 (32KB) + L1i L#16 (32KB) + Core L#16
        PU L#32 (P#16)
        PU L#33 (P#36)
      L2 L#17 (256KB) + L1d L#17 (32KB) + L1i L#17 (32KB) + Core L#17
        PU L#34 (P#17)
        PU L#35 (P#37)
      L2 L#18 (256KB) + L1d L#18 (32KB) + L1i L#18 (32KB) + Core L#18
        PU L#36 (P#18)
        PU L#37 (P#38)
      L2 L#19 (256KB) + L1d L#19 (32KB) + L1i L#19 (32KB) + Core L#19
        PU L#38 (P#19)
        PU L#39 (P#39)
    HostBridge
      PCIBridge
        PCI 81:00.0 (InfiniBand)
          Net "ib0"
          OpenFabrics "mlx5_0"

This is an old bisocket machine.

Details of the problem

We were experimenting with logical and physical numbering and ran into strange behavior on this particular machine: the physical numbering of the cores is not contiguous. Running lstopo -p shows that no core has a physical ID from 5 to 7 (on either package):

$ lstopo -p --no-io --ignore pu
Machine (220GB total)
  Package P#0
    NUMANode P#0 (125GB)
    L3 P#0 (25MB)
      L2 P#0 (256KB) + L1d P#0 (32KB) + L1i P#0 (32KB) + Core P#0
      L2 P#1 (256KB) + L1d P#1 (32KB) + L1i P#1 (32KB) + Core P#1
      L2 P#2 (256KB) + L1d P#2 (32KB) + L1i P#2 (32KB) + Core P#2
      L2 P#3 (256KB) + L1d P#3 (32KB) + L1i P#3 (32KB) + Core P#3
      L2 P#4 (256KB) + L1d P#4 (32KB) + L1i P#4 (32KB) + Core P#4
      L2 P#8 (256KB) + L1d P#8 (32KB) + L1i P#8 (32KB) + Core P#8
      L2 P#9 (256KB) + L1d P#9 (32KB) + L1i P#9 (32KB) + Core P#9
      L2 P#10 (256KB) + L1d P#10 (32KB) + L1i P#10 (32KB) + Core P#10
      L2 P#11 (256KB) + L1d P#11 (32KB) + L1i P#11 (32KB) + Core P#11
      L2 P#12 (256KB) + L1d P#12 (32KB) + L1i P#12 (32KB) + Core P#12
  Package P#1
    NUMANode P#1 (94GB)
    L3 P#1 (25MB)
      L2 P#16 (256KB) + L1d P#16 (32KB) + L1i P#16 (32KB) + Core P#0
      L2 P#17 (256KB) + L1d P#17 (32KB) + L1i P#17 (32KB) + Core P#1
      L2 P#18 (256KB) + L1d P#18 (32KB) + L1i P#18 (32KB) + Core P#2
      L2 P#19 (256KB) + L1d P#19 (32KB) + L1i P#19 (32KB) + Core P#3
      L2 P#20 (256KB) + L1d P#20 (32KB) + L1i P#20 (32KB) + Core P#4
      L2 P#24 (256KB) + L1d P#24 (32KB) + L1i P#24 (32KB) + Core P#8
      L2 P#25 (256KB) + L1d P#25 (32KB) + L1i P#25 (32KB) + Core P#9
      L2 P#26 (256KB) + L1d P#26 (32KB) + L1i P#26 (32KB) + Core P#10
      L2 P#27 (256KB) + L1d P#27 (32KB) + L1i P#27 (32KB) + Core P#11
      L2 P#28 (256KB) + L1d P#28 (32KB) + L1i P#28 (32KB) + Core P#12

Even hwloc-calc behaves similarly:

$ hwloc-calc core:4 -p -I core --oo
Core:4
$ hwloc-calc core:5 -p -I core --oo
Core:8
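
To make the gap explicit, here is a minimal Python sketch that parses an abbreviated copy of the `lstopo -p` listing above and reports which core physical IDs are skipped in each package. The helper names (`core_ids_by_package`, `missing_ids`) and the trimmed `LSTOPO_P` sample are illustrative assumptions, not part of hwloc. It also assumes the listing order matches hwloc's logical order, which is what the outputs above suggest.

```python
import re

# Core lines abbreviated from the `lstopo -p` output above (cache columns dropped).
LSTOPO_P = """\
Package P#0
  Core P#0
  Core P#1
  Core P#2
  Core P#3
  Core P#4
  Core P#8
  Core P#9
  Core P#10
  Core P#11
  Core P#12
Package P#1
  Core P#0
  Core P#1
  Core P#2
  Core P#3
  Core P#4
  Core P#8
  Core P#9
  Core P#10
  Core P#11
  Core P#12
"""


def core_ids_by_package(lstopo_text):
    """Map each package physical ID to its core physical IDs, in listing order."""
    cores = {}
    package = None
    for line in lstopo_text.splitlines():
        pkg = re.search(r"Package P#(\d+)", line)
        if pkg:
            package = int(pkg.group(1))
            cores[package] = []
            continue
        core = re.search(r"Core P#(\d+)", line)
        if core and package is not None:
            cores[package].append(int(core.group(1)))
    return cores


def missing_ids(ids):
    """Return the physical IDs skipped between the smallest and largest ID."""
    present = set(ids)
    return [i for i in range(min(ids), max(ids) + 1) if i not in present]


cores = core_ids_by_package(LSTOPO_P)
for package, ids in cores.items():
    print(f"Package P#{package}: missing core IDs {missing_ids(ids)}")

# Flatten in listing order: position 5 plays the role of logical core 5,
# the one that `hwloc-calc core:5 -p -I core` reported as physical core 8.
flat = [(pkg, cid) for pkg, ids in cores.items() for cid in ids]
print(flat[5])
```

The same regexes also match the full `lstopo -p --no-io --ignore pu` lines, since each of them ends in `Core P#<n>`.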

Attached is the gathered topology.

  • But why does the numbering of the cores skip some values?
  • Also, the physical numbering of the PUs seems OK (contiguous). Why the different behavior?

Best.

PU numbering is determined by the OS based on APIC IDs; it's made contiguous on purpose, since otherwise things would be annoying in practice. Core numbering is pretty much useless: it's only shown in debugging messages, hence the OS doesn't change anything there. My understanding is that it's often non-contiguous in hardware because most CPU SKUs have some cores disabled, but the enabled cores still get the same core IDs as if all cores were enabled.
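
For anyone who wants to look at the raw values on Linux, the kernel exposes a per-CPU `core_id` and `physical_package_id` under `/sys/devices/system/cpu/cpuN/topology/`. Below is a minimal sketch that reads them. The function name `read_core_ids` and its `sysfs_root` parameter are hypothetical, and that these files hold the same IDs hwloc displays as core P# on this machine is an assumption, not something confirmed in this thread.

```python
from pathlib import Path


def read_core_ids(sysfs_root="/sys/devices/system/cpu"):
    """Return {cpu_number: (package_id, core_id)} from Linux sysfs topology files."""
    result = {}
    # "cpu[0-9]*" matches cpu0, cpu10, ... but skips cpufreq, cpuidle, etc.
    for cpu_dir in Path(sysfs_root).glob("cpu[0-9]*"):
        topo = cpu_dir / "topology"
        try:
            package_id = int((topo / "physical_package_id").read_text())
            core_id = int((topo / "core_id").read_text())
        except (OSError, ValueError):
            continue  # topology files missing or unreadable for this CPU
        result[int(cpu_dir.name[3:])] = (package_id, core_id)
    return result


if __name__ == "__main__":
    for cpu, (package_id, core_id) in sorted(read_core_ids().items()):
        print(f"PU P#{cpu}: package {package_id}, core_id {core_id}")
```

The CPU numbers in the directory names are the contiguous OS numbering of PUs, while `core_id` is passed through as reported, so on a machine like the one above the first column would have no gaps and the last one could.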

Thanks for the quick answer; another reason to use logical numbering :)