Invalid version number for flux!
vsoch opened this issue · 8 comments
It looks like the library here is linked to specific flux versions:
[2023-06-17 05:41:16: ERROR] Submission failed -- Message (invalid version number '0.49.0-225-g53e087510').
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/maestrowf/interfaces/script/_flux/flux0_26_0.py", line 64, in submit
cls.connect_to_flux()
File "/usr/local/lib/python3.8/dist-packages/maestrowf/abstracts/interfaces/flux.py", line 29, in connect_to_flux
broker_version = StrictVersion(broker_version)
File "/usr/lib/python3.8/distutils/version.py", line 40, in __init__
self.parse(vstring)
File "/usr/lib/python3.8/distutils/version.py", line 137, in parse
raise ValueError("invalid version number '%s'" % vstring)
ValueError: invalid version number '0.49.0-225-g53e087510'
It might be hard to keep these two things in sync - and if the API for flux is stable, arguably this might be overkill. Can we talk about possible strategies for handling this? E.g., right now flux core (I think) is at 0.51.0 and I think possibly we should be more flexible to allow a range, and then come here and set a stopping point for said range only when we know there is an interface change.
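A minimal sketch of the range idea, assuming versions have already been parsed into integer tuples. The table contents are illustrative only: `flux0_26_0` is the module name from the traceback above, while `legacy_adapter` and every boundary value are hypothetical and not taken from Maestro.

```python
# Hypothetical table: (minimum inclusive, maximum exclusive or None, adapter name).
# An open-ended maximum (None) means "until we learn of an interface change".
INTERFACE_RANGES = [
    ((0, 17, 0), (0, 26, 0), "legacy_adapter"),  # hypothetical older adapter
    ((0, 26, 0), None, "flux0_26_0"),            # name seen in the traceback
]


def select_interface(version, ranges=INTERFACE_RANGES):
    """Return the adapter name whose range contains the version tuple."""
    for minimum, maximum, name in ranges:
        if version >= minimum and (maximum is None or version < maximum):
            return name
    raise LookupError(f"no interface registered for version {version}")


print(select_interface((0, 49, 0)))
print(select_interface((0, 51, 0)))
```

With this layout, a newer flux release keeps working by default, and a stopping point only needs to be added to the last row once an interface change is known.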
And to keep you in the loop, I've started my small repository of examples here: https://github.com/rse-ops/maestro-examples. I think a strategy I will advocate for is to run this inside of a container and have all software dependencies ready to go! At least in a cloud context, you won't want to wait for a build step to download and build something (akin to what lulesh does, unless it's really, really fast!). But tonight is the first time I'm trying this out, so it's an early impression.
I'm wanting to integrate this with our PerfFlow tool, so I think next I'll try to find something in Python that can have the PerfFlow wrappers.
That's an interesting issue; I have been testing things out with 0.49 and haven't run into that yet. As for smarter version selection based on flux API changes, adding more flexible ranges would be nice, and is certainly something we can do on the maestro side of things. One question there: is there any ~automated way to figure that out, versus having someone read changelogs and pin those API changes manually, or, more likely, waiting for a bug ticket from a user who finds it like this?
Aside from flux --version:
# flux --version
commands: 0.46.0
libflux-core: 0.46.0
libflux-security: 0.8.0
build-options: +caliper+hwloc==1.11.6+zmq==4.2.5
Apart from that, and the different versions for the various specs (e.g., JobSet or Resource), I don't think we expose this, but we could! Let me look into it, and then we can do a subprocess with flux --version and fall back to the newer command (which will eventually be dominant).
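A rough sketch of the subprocess approach, assuming the output keeps the "key: value" layout shown above. The function names are hypothetical, and `query_flux_cli` only works where a `flux` executable is on the PATH.

```python
import subprocess


def parse_flux_version_output(text):
    """Turn the 'key: value' lines printed by the CLI into a dict of strings."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines that have no colon
            fields[key.strip()] = value.strip()
    return fields


def query_flux_cli():
    """Run `flux --version` and parse its output (requires flux on PATH)."""
    result = subprocess.run(
        ["flux", "--version"], capture_output=True, text=True, check=True
    )
    return parse_flux_version_output(result.stdout)


# Sample copied from the output shown above, so the parser can be
# exercised without a flux installation.
SAMPLE = """\
commands: 0.46.0
libflux-core: 0.46.0
libflux-security: 0.8.0
build-options: +caliper+hwloc==1.11.6+zmq==4.2.5
"""

fields = parse_flux_version_output(SAMPLE)
print(fields["libflux-core"])
```

Keeping the parsing separate from the subprocess call means the parser can be tested against captured output from different flux versions.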
@jwhite242 here is a fallback you can use:
import flux

# Open a handle to the running Flux broker and read its "version" attribute.
h = flux.Flux()
print(h.attr_get("version"))
And we are working on a more Pythonic way:
Oh sweet, Python access to that would be cool! Also, I just realized looking at your original ticket that this might just be a version parsing issue too, given you appear to be using some sort of 'local' or 'pre-release' version number here (the '-225-g53e087510' bit on the end) rather than the likely expected length-3 tuple of a release version. That being said, we should be able to trivially use Python's built-in parsing to fix that, since PEP 440 knows how to deal with those already, enabling easy >=, <=, == comparisons in here.
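For illustration, here is one way the dev build string could be turned into something comparable. This is a sketch, not Maestro's actual fix: it assumes the suffix follows the `-<commits>-g<sha>` shape of the string reported above, and it uses a plain regular expression so it does not depend on which version parser accepts that suffix. The name `parse_broker_version` is hypothetical.

```python
import re

# Matches "MAJOR.MINOR.PATCH", optionally followed by "-<commits>-g<sha>",
# which is the shape of the dev build string reported in this issue.
_VERSION_RE = re.compile(
    r"^(?P<major>\d+)\.(?P<minor>\d+)\.(?P<patch>\d+)"
    r"(?:-(?P<commits>\d+)-g(?P<sha>[0-9a-f]+))?$"
)


def parse_broker_version(vstring):
    """Return (major, minor, patch, commits_since_tag) as a tuple of ints."""
    match = _VERSION_RE.match(vstring.strip())
    if match is None:
        raise ValueError(f"unrecognized version string {vstring!r}")
    commits = int(match.group("commits") or 0)  # 0 for a plain release
    return (
        int(match.group("major")),
        int(match.group("minor")),
        int(match.group("patch")),
        commits,
    )


release = parse_broker_version("0.49.0")
dev_build = parse_broker_version("0.49.0-225-g53e087510")

print(release)    # (0, 49, 0, 0)
print(dev_build)  # (0, 49, 0, 225)
print(dev_build > release)      # True: the dev build sorts after its base tag
print(dev_build >= (0, 26, 0))  # True: tuples compare element by element
```

Because the result is an ordinary tuple, the usual >=, <=, and == comparisons work without any extra machinery.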
ah yes that would do it - the version can be parsed from git, so I think Maestro might be flexible to that. Let me know if/when there is something new to try or if I can otherwise help! I made a library just for parsing container tags, and likely some of the ideas there translate to the version string here: https://github.com/vsoch/pipelib/
Ok @vsoch, I think the version parsing/logging should be sorted out now: the broker version parser gets tested against several semver patterns, including the flux dev build strings you reported here. Let me know if there are other variants flux brokers might report that need testing here.
Also, the correct version should now get dumped into the step's batch file headers and the logs too so there's better tracking of what it was actually tested with. The flux broker and maestro versions are logged separately too to make things more clear.
Okay! I don't have this testing environment set up, but if it's fixed please feel free to close the issue. Thank you and have a great weekend!