adjtomo/seisflows

system parameter ntask_max is not honored for certain subclasses


Certain System subclasses (e.g., Frontera, Wisteria) do not support array jobs. The workaround implementation is to submit individual jobs to the system one by one. However, these modules have no mechanism for honoring the parameter ntask_max, so they submit all jobs to the job scheduler simultaneously.

This is not the intended behavior and may lead to resource competition or upset sysadmins. These systems need their own internal ntask_max routine that submits at most ntask_max jobs at once and monitors the queue, submitting new jobs as previous jobs complete.

I think all the requisite pieces are there; this just requires implementation and testing. The biggest hurdle will likely be the live checking of the job queue and the decision to submit new jobs, which can sometimes be a finicky operation.
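
For discussion, here is a rough sketch of what such a routine could look like. It is not SeisFlows code: the function name `run_throttled` and the `submit` and `is_done` callables are hypothetical stand-ins for whatever a given System subclass uses to submit a single job and to check whether that job has left the queue. It also assumes that a job counts as finished as soon as it leaves the queue, and it does not distinguish failed jobs from successful ones.

```python
import time
from collections import deque


def run_throttled(task_ids, ntask_max, submit, is_done, poll_interval=0.0):
    """Submit tasks so that at most `ntask_max` jobs are queued at once.

    Hypothetical interface (not part of SeisFlows):
      submit(task_id) -> job_id   submits one job and returns its ID
      is_done(job_id) -> bool     True once the job has left the queue

    Returns the largest number of jobs that were tracked as running at
    the same time, which should never exceed `ntask_max`.
    """
    if ntask_max < 1:
        raise ValueError("ntask_max must be at least 1")

    pending = deque(task_ids)  # tasks not yet submitted
    running = set()            # job IDs currently believed to be queued
    peak = 0

    while pending or running:
        # Drop jobs that have finished since the last poll
        running = {job for job in running if not is_done(job)}

        # Fill any free slots with new submissions
        while pending and len(running) < ntask_max:
            running.add(submit(pending.popleft()))

        peak = max(peak, len(running))

        # Wait before polling the queue again
        if running:
            time.sleep(poll_interval)

    return peak
```

In a real system, `is_done` would wrap the queue query, which is where the finicky part lives: the check would likely need to tolerate transient scheduler errors and jobs that have not yet appeared in the queue listing, and `poll_interval` would need to be large enough to avoid hammering the scheduler.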