Clarify what constitutes a valid group in the hwloc_topology_insert_group_object() docs
Closed this issue · 6 comments
If I reverse-engineered the API contract correctly, hwloc_topology_insert_group_object()
expects to receive (via the pre-initialized group object) the union of the cpusets/nodesets of N sibling objects below the same parent. It then does the conceptual equivalent of the following sequence of operations:
- Reorder the group children in their common parent's child list so that the corresponding objects end up lying at consecutive logical indices.
- Create a new group depth below the depth of the parent object if need be.
- Create a group object at this depth with the original parent as parent and the group members as children + reassign group members' parent pointers to point to the group.
- Remove left sibling bidirectional link of the leftmost object in the group children and make it the left sibling of the group instead + same idea with right sibling of the rightmost object.
Assuming my understanding is correct, it would be nice if the docs of hwloc_topology_insert_group_object()
could clarify that groups may only be composed of N sibling objects lying below the same parent.
Your understanding looks correct but we cannot document grouping as "children" only. From the user point of view, you may actually want to group some grand-children, which is functionally equivalent to grouping their parents. For instance on a machine with 4 sockets of 2 cores each, grouping 2 sockets is equivalent to grouping the corresponding 4 cores.
In short, I may need to take your doc but put your "conceptually equivalent" earlier.
I initially started from the "you can group any objects at any depth" perspective, since that's what the hwloc_obj_add_other_obj_sets()
API somewhat hints at. But I found that when you think of things this way, it is hard to understand some important points of the hwloc_topology_insert_group_object()
API contract:
- One would expect any set of object to form a valid group, but actually only some combinations of objects can be grouped (e.g. I cannot group CPUs 0, 1 and 4 of a given socket).
- One would expect the newly created
Group
object to become the parent of the objects that you're grouping, but it can become the parent of one of their ancestors instead (e.g. on an x86 machine where CPU 0 and 1 are hyperthreads of the first core and CPU 2 and 3 are hyperthreads of the second core, "grouping these PUs" with cpuset 0-3 will actually group the associated L2 caches).
Both of these confusions are easily cleared by the "you can only group sibling objects" viewpoint, which is why I was proposing that the docs be written in terms of this viewpoint instead.
Ok I see, indeed, then I guess we can start from your viewpoint but add a note saying that it's actually more general but with lots of constraints.
Here's a proposal:
/** \brief Add more structure to the topology by adding an intermediate Group
*
* The caller should first allocate a new Group object with hwloc_topology_alloc_group_object().
* Then it must setup at least one of its CPU or node sets to specify
* the final location of the Group in the topology.
* Then the object can be passed to this function for actual insertion in the topology.
*
* The main use case for this function is to group a subset of
* siblings among the list of children below a single parent.
* For instance, if grouping 4 cores out of a 8-core socket,
* the logical list of cores will be reordered so that the 4 grouped
* ones are consecutive.
* Then, if needed, a new depth is added between the parent and those
* children, and the Group is inserted there.
* At the end, the 4 grouped cores are now children of the Group,
* which replaces them as a child of the original parent.
*
* In practice, the grouped objects are specified through cpusets
* and/or nodesets, for instance using hwloc_obj_add_other_obj_sets()
* iteratively.
* Hence it is possible to Group objects that are not children of the
* same parent, for instance some PUs below the 4 cores in example above.
* However this general case may fail if the expected Group conflicts
* with the existing hierarchy.
* For instance if each core has two PUs, it is not possible to insert
* a Group containing a single PU of each core.
*
* To specify the objects to group, either the cpuset or nodeset field
* (or both, if compatible) must be set to a non-empty bitmap.
* The complete_cpuset or complete_nodeset may be set instead if
* inserting with respect to the complete topology
* (including disallowed, offline or unknown objects).
* These sets cannot be larger than the current topology, or they would get
* restricted silently.
* The core will setup the other sets after actual insertion.
*
[...]
I like it! 👍
[...] * Hence it is possible to Group objects that are not children of the * same parent, for instance some PUs below the 4 cores in example above. [...]
Probably Group
should have dropped the capital 'G' (as we refer to the verb, not the object type).