openflighthpc/flight-profile

Support genders-style syntax for multi-node apply

Closed this issue · 12 comments

If I have 10 nodes and I want to apply to all of them, I need to do:

[root@login1 ~]# flight profile apply node01,node02,node03,node04,node05,node06,node07,node08,node09,node10 compute

It'd be really useful if I could use genders syntax such that the following would work:

[root@login1 ~]# flight profile apply node[01-10] compute

Some example cases for different formatting scenarios, just checking that each of these is desired behaviour.

Standard case:

node[01-12] =>
node01,node02,node03,node04,node05,node06,node07,node08,node09,node10,node11,node12

Start is shorter than end, length of start takes priority:

node[1-012] =>
node1,node2,node3,node4,node5,node6,node7,node8,node9,node10,node11,node12

Start has more padding than end, length of start takes priority:

node[00001-12] =>
node00001,node00002,node00003,node00004,node00005,node00006,node00007,node00008,node00009,node00010,node00011,node00012

In the event of multiple valid segments of range syntax being present, the rightmost will always take priority:

node[06-17][01-12] =>
"node01,node02,node03,node04,node05,node06,node07,node08,node09,node10,node11,node12"

Nodes which do not end in [X-X] (where X is a string of one or more numeric characters) will be ignored for the purposes of this system and will be treated as single labels, e.g. node[1234], node[12-15, node[12 - 16] would be considered to be individual labels, instead of raising errors or requests for better formatting. The intent behind this is to have all other node names be unaffected by these changes, unless they end in this specific node formatting style.
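The rules listed above can be sketched in Ruby roughly as follows. This is a minimal illustration, not the snippet referred to in this thread: the names `RANGE_LABEL`, `expand_label` and `expand_labels` are hypothetical, and two behaviours are assumptions, namely that earlier bracket groups are discarded when the rightmost one wins (inferred from the node[06-17][01-12] output) and that a descending range is left as a literal label.

```ruby
# Hypothetical sketch; the names and regex are illustrative, not flight-profile's code.
# Matches: a bracket-free prefix, any earlier [N-M] groups (ignored), and a final [N-M].
RANGE_LABEL = /\A([^\[\]]*)(?:\[\d+-\d+\])*\[(\d+)-(\d+)\]\z/

def expand_label(label)
  match = RANGE_LABEL.match(label)
  return [label] unless match # not range syntax: keep as a single label

  prefix, first, last = match.captures
  # Assumption: a descending range such as node[12-01] stays a literal label.
  return [label] if first.to_i > last.to_i

  width = first.length # padding follows the length of the range start
  (first.to_i..last.to_i).map { |n| prefix + n.to_s.rjust(width, "0") }
end

# Expands every label in a comma-separated argument.
def expand_labels(argument)
  argument.split(",").flat_map { |label| expand_label(label) }
end
```

For example, `expand_labels("login1,node[01-03]")` would yield login1, node01, node02 and node03, while `expand_label("node[1234]")` returns the label unchanged.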

That's looking to make sense. I think we shouldn't be doing multiple square brackets anyway but the program having some sensible way of handling it is a good thing.

Might be useful for some further reference - https://github.com/chaos/genders/blob/master/TUTORIAL

Nodes which do not end in [X-X] (where X is a string of one or more numeric characters) will be ignored for the purposes of this system and will be treated as single labels, e.g. node[1234], node[12-15, node[12 - 16] would be considered to be individual labels

I think this is a reasonable approach to handling non-syntax-compliant strings, but it does concern me that strange things could then occur. I think it's best to handle it as similarly to genders as we can (results from putting the strings you mentioned into /tmp/test and parsing with nodeattr -f /tmp/test --expand):

# node[01-12]
# > Result as you outlined

# node[1-012]
# > Result as you outlined

# node[00001-12]
# > Result as you outlined 

# node[06-17][01-12]
# > "nodeattr: genders_getattr: node or attribute not found"

Is there an existing gem that handles genders expansion that we could use for this? @VoxSecundus may be able to provide some further info on existing solutions, I have an inkling some of our tools may have done this before 🤔

I don't believe there's any gem made in the last 10 years that can do this, no. There is a rather old Ruby implementation of nodeattr which contains some parsing logic that may be useful (particularly NodeAttr::explode_nodes), but I'd be interested to see how @Womblue approaches a problem like this 🙂

The actual genders expansion itself is essentially done (the outputs I listed are the actual outputs of the snippet I've got for expanding the given string). Would you prefer that we raise errors when the formatting isn't correct? We could implement some extra validation, something like:

  • Node labels can't have square brackets in them
  • An argument given in Profile's apply can only have one set of square brackets, and it must be at the end.
  • If the square brackets don't match or don't contain the valid formatting of numbers, raise an error and inform the user

Worth noting that the first of these would also require changes to Hunter (which doesn't restrict node labels in any way to my knowledge)
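As a rough illustration of the stricter option in the list above, the check could look something like this. The error class `InvalidRangeError`, the constant `STRICT_RANGE` and the method `validate_label!` are hypothetical names for this sketch, and the error message wording is invented.

```ruby
# Hypothetical error class and helper, for illustration only.
class InvalidRangeError < StandardError; end

# Exactly one well-formed [N-M] group, and it must be at the end of the label.
STRICT_RANGE = /\A[^\[\]]*\[\d+-\d+\]\z/

def validate_label!(label)
  return label unless label.match?(/[\[\]]/) # no brackets: plain label
  return label if label.match?(STRICT_RANGE) # one valid trailing range

  raise InvalidRangeError,
        "'#{label}' is not a valid range; expected a form such as node[01-10]"
end
```

Under this check a label such as node[login] is rejected, because any square bracket is treated as an attempt at range syntax.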

Node labels can't have square brackets in them

I think applying restrictions at different points in our tools for hostname formatting could lead to unforeseen issues unless we have a suite-wide standard for it (which, imo, is also not worth it!). Quick Edit: I see you've just mentioned something along these lines too, I think the scope is too big for what it's worth on enforcing hostname standards

An argument given in Profile's apply can only have one set of square brackets, and it must be at the end.

This solution seems a bit restrictive but seems along the right lines. Perhaps like how we have --regex in hunter to indicate "hey try doing some regex with my node argument", we could potentially do the same here (although integrated expansion of node ranges would be more ideal for me).

If the square brackets don't match or don't contain the valid formatting of numbers, raise an error and inform the user

I like this. I'm far more for failing loudly and clearly than potentially doing things the user doesn't expect from their input.

We could implement some extra validation

I would recommend looking at the regex that the nodeattr gem I posted above uses for this. Worth remembering that the genders syntax supports any kind of range, not just integer ranges. Edit: I was misreading the nodeattr docs. It does not support string ranges.

An argument given in Profile's apply can only have one set of square brackets, and it must be at the end.

I agree that integrated range expansion should be the goal here; I wouldn't expect to have to use another CLI argument to specify a range.

Although, @ColonelPanicks this links to what we spoke about a while ago with regards to Flight Hunter label restrictions. If a user likes to name their nodes silly things with brackets (square or otherwise), how is Flight Profile supposed to know which square brackets denote nodeattr ranges and which are just part of the name?

Worth remembering that the genders syntax supports any kind of range, not just integer ranges.

Is this desired behaviour @ColonelPanicks? I understand this may be useful for genders specifically, but considering that node suffixes are generated by hunter, are we going to need to recognise non-integer ranges?

If the square brackets don't match or don't contain the valid formatting of numbers, raise an error and inform the user

I like this. I'm far more for failing loudly and clearly than potentially doing things the user doesn't expect from their input.

The issue I have here is at what point we decide that a user has typed a node name incorrectly and to raise an error. I don't think it's out of the question a node could be named something like mynode[login] or something to a similar effect, and it obviously wouldn't be correct to raise an error to a user attempting to apply to this node and tell them they haven't formatted their range correctly. Edit: I see @VoxSecundus has spotted this problem too.

Although, @ColonelPanicks this links to what we spoke about a while ago with regards to Flight Hunter label restrictions. If a user likes to name their nodes silly things with brackets (square or otherwise), how is Flight Profile supposed to know which square brackets denote nodeattr ranges and which are just part of the name?

As with most development things I think this falls into the "too difficult to handle, too easy to break" category. There's no way we can fully prevent ridiculous inputs from causing further issues and I think the best we can do in that situation is user education instead of trying to handle this in the program itself

Is this desired behaviour @ColonelPanicks? I understand this may be useful for genders specifically, but considering that node suffixes are generated by hunter, are we going to need to recognise non-integer ranges?

I can't see any evidence of genders syntax handling ranges of letters so I'm not sure if that's true and I'm not particularly fussed about it being handled by our tool as it stands. Perhaps me labelling this issue as "Gender syntax" has added additional confusion when, ultimately, I think it's either of:

  • Expanding numerical ranges
  • Generally supporting regex input

Either is fine with me. Supporting regex in general is easier on the dev side but harder (or at least has more of a barrier to entry) on the user side, and vice versa for the numerical expansion, which is more user-friendly. Given that I've already got a decent implementation of the numerical expansion I'm leaning towards that.

As with most development things I think this falls into the "too difficult to handle, too easy to break" category. There's no way we can fully prevent ridiculous inputs from causing further issues and I think the best we can do in that situation is user education instead of trying to handle this in the program itself

Does "user education" imply that any string ending in a segment of square brackets must be an instance of this new range aspect? So node[login]example would be a valid (albeit strange) node formatting, whereas node[login] would prompt the user to correct their invalid range formatting?

I think we continue with genders-style instead of regex.

I don't like requiring the square brackets to be at the end of the string, as it's a bit too restrictive (as I mentioned earlier). Plus a hostname format is far more likely to have a numerical range in the middle than to contain literal square brackets.

[root@jupyter1 (jup1) ~]# cat /tmp/test
node[01-04]rack1
[root@jupyter1 (jup1) ~]# nodeattr --expand -f /tmp/test
node01rack1
node02rack1
node03rack1
node04rack1

I think it's generally appreciated that alphanumeric names (along with maybe dashes or underscores) are used in Linux. Trying to handle edge-cases of hostnames is outside of scope. If a user were to put . in their hostname it would probably cause all sorts of DNS issues... DNS systems don't warn about it, users/admins should just not do it!
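A range in the middle of a name, as in the node[01-04]rack1 output above, could be handled with a sketch along these lines. The names `INLINE_RANGE` and `expand_inline_range` are hypothetical, and the padding rule (width of the range start) is carried over from the cases earlier in this thread as an assumption. Only the first bracket group is expanded; anything after it is kept as literal text.

```ruby
# Hypothetical sketch: expands the first [START-END] group found anywhere in the name.
INLINE_RANGE = /\[(\d+)-(\d+)\]/

def expand_inline_range(name)
  match = INLINE_RANGE.match(name)
  return [name] unless match # no range syntax: keep the name as-is

  first, last = match.captures
  width = first.length # assumption: pad to the width of the range start
  (first.to_i..last.to_i).map do |n|
    # Text before and after the bracket group is preserved around each number.
    match.pre_match + n.to_s.rjust(width, "0") + match.post_match
  end
end
```

With this, `expand_inline_range("node[01-04]rack1")` gives node01rack1 through node04rack1, and a plain name such as login1 comes back unchanged.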

Resolved in #67.