NAICNO/Jobanalyzer

naicreport should be able to use -remote -cluster -auth-file in the same way as sonalyze


This is technical debt. naicreport currently must run on naic-monitor, where it runs sonalyze against a local data store. It should instead run sonalyze against a remote daemon, which would make it movable to other systems. It would also benefit from the caching that the daemon performs, taking a lot of I/O load off naic-monitor when these reports run.

(A related reality is that the various components of naicreport share very little code and could be broken out as separate reports sharing command-parsing machinery and maybe a simple, less ad-hoc database component for their state. That is worth keeping in mind when rewriting.)

A couple of minor complications:

  • the glance report requires a config file, which it uses to enumerate host names and to decide, for each host, whether it has a GPU (related to #401).
  • the hostnames report cannot be implemented on top of the current sonalyze daemon because it greps the report directory directly. That approach is broken anyway (#330): the report should really be using the config file too.

The most sensible fix might be to implement a config or hosts verb in sonalyze that can extract host information. Hosts could be filtered as usual, so we could ask for e.g. gpu-[1-20] on fox and not get all the compute nodes; this probably makes the most sense on bigger clusters. There would be field selectors in the usual way, and the usual formats would be supported.
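To illustrate the kind of host filtering described above, here is a minimal Go sketch that expands a pattern such as gpu-[1-20] into individual host names. It is not sonalyze's actual matcher: the function name expandHostPattern is hypothetical, and the sketch assumes a pattern contains at most one bracketed lo-hi range, which is likely simpler than the syntax sonalyze really accepts.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// expandHostPattern is a hypothetical helper, not part of sonalyze.
// It expands "gpu-[1-3]" into ["gpu-1", "gpu-2", "gpu-3"]. A pattern without
// brackets is returned unchanged. Only a single "lo-hi" range is handled.
func expandHostPattern(pattern string) ([]string, error) {
	open := strings.Index(pattern, "[")
	if open < 0 {
		return []string{pattern}, nil
	}
	end := strings.Index(pattern[open:], "]")
	if end < 0 {
		return nil, fmt.Errorf("unterminated range in %q", pattern)
	}
	end += open // convert the offset back to an index into pattern
	prefix, suffix := pattern[:open], pattern[end+1:]

	loStr, hiStr, ok := strings.Cut(pattern[open+1:end], "-")
	if !ok {
		return nil, fmt.Errorf("expected lo-hi inside brackets in %q", pattern)
	}
	lo, err := strconv.Atoi(loStr)
	if err != nil {
		return nil, fmt.Errorf("bad lower bound in %q: %w", pattern, err)
	}
	hi, err := strconv.Atoi(hiStr)
	if err != nil {
		return nil, fmt.Errorf("bad upper bound in %q: %w", pattern, err)
	}
	if lo > hi {
		return nil, fmt.Errorf("empty range in %q", pattern)
	}

	hosts := make([]string, 0, hi-lo+1)
	for i := lo; i <= hi; i++ {
		hosts = append(hosts, prefix+strconv.Itoa(i)+suffix)
	}
	return hosts, nil
}

func main() {
	hosts, err := expandHostPattern("gpu-[1-3]")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(hosts)
}
```

A hosts verb could apply a matcher along these lines to the host names in the config file before selecting fields, so that only the requested nodes are reported.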

Then glance would run sonalyze hosts -fmt csv,host,cards to get the host names and the GPU info, and hostnames would simply run sonalyze hosts -fmt csv,host and we'd be done.
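As a rough sketch of the consuming side, the Go code below parses the kind of output glance might get back from the proposed hosts verb. Everything about the format is an assumption, since the verb does not exist yet: the sketch assumes two-column CSV with no header line, the host name first and an integer card count second. The names hostInfo and parseHosts are hypothetical.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strconv"
	"strings"
)

// hostInfo holds the two fields glance would need per host (hypothetical type).
type hostInfo struct {
	Name  string
	Cards int // number of GPU cards; 0 means the host has no GPU
}

// parseHosts parses CSV text of the assumed form "host,cards", one host per
// line, with no header line.
func parseHosts(csvText string) ([]hostInfo, error) {
	rd := csv.NewReader(strings.NewReader(csvText))
	rd.FieldsPerRecord = 2 // reject rows that do not have exactly two fields
	rows, err := rd.ReadAll()
	if err != nil {
		return nil, err
	}
	hosts := make([]hostInfo, 0, len(rows))
	for _, row := range rows {
		cards, err := strconv.Atoi(strings.TrimSpace(row[1]))
		if err != nil {
			return nil, fmt.Errorf("bad card count for host %q: %w", row[0], err)
		}
		hosts = append(hosts, hostInfo{Name: row[0], Cards: cards})
	}
	return hosts, nil
}

func main() {
	// Made-up sample data standing in for the command's output.
	sample := "gpu-1,4\nc1-10,0\n"
	hosts, err := parseHosts(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, h := range hosts {
		fmt.Printf("%s hasGPU=%v\n", h.Name, h.Cards > 0)
	}
}
```

The hostnames report would need only the first column, so it could use the same approach with a single field per record.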