jobs are "INITIALIZED" but not starting
BenWibking opened this issue · 3 comments
I have a test study running on my laptop that appears stuck in this state:
===================================================================================================================================================================================
Step Name            Job ID   Workspace            State       Run Time       Elapsed Time   Start Time          Submit Time         End Time            Number Restarts
-------------------- -------- -------------------- ----------- -------------- -------------- ------------------- ------------------- ------------------- -----------------
generate-profile_0.1 26102    generate-profile/0.1 FINISHED    0d:00h:00m:02s 0d:00h:00m:02s 2024-01-29 13:54:55 2024-01-29 13:54:55 2024-01-29 13:54:57 0
generate-profile_0.3 26108    generate-profile/0.3 FINISHED    0d:00h:00m:02s 0d:00h:00m:02s 2024-01-29 13:54:57 2024-01-29 13:54:57 2024-01-29 13:54:59 0
generate-profile_1.0 26111    generate-profile/1.0 FINISHED    0d:00h:00m:02s 0d:00h:00m:02s 2024-01-29 13:54:59 2024-01-29 13:54:59 2024-01-29 13:55:01 0
generate-infile_0.1  26303    generate-infile/0.1  FINISHED    0d:00h:00m:02s 0d:00h:00m:02s 2024-01-29 13:56:01 2024-01-29 13:56:01 2024-01-29 13:56:03 0
generate-infile_0.3  26322    generate-infile/0.3  FINISHED    0d:00h:00m:01s 0d:00h:00m:01s 2024-01-29 13:56:03 2024-01-29 13:56:03 2024-01-29 13:56:04 0
generate-infile_1.0  26340    generate-infile/1.0  FINISHED    0d:00h:00m:02s 0d:00h:00m:02s 2024-01-29 13:56:04 2024-01-29 13:56:04 2024-01-29 13:56:06 0
run-sim_0.1          --       run-sim/0.1          INITIALIZED --:--:--       --:--:--       --                  --                  --                  0
run-sim_0.3          --       run-sim/0.3          INITIALIZED --:--:--       --:--:--       --                  --                  --                  0
run-sim_1.0          --       run-sim/1.0          INITIALIZED --:--:--       --:--:--       --                  --                  --                  0
===================================================================================================================================================================================
The workspace subdirectories for run-sim_0.3 and run-sim_1.0 are empty, and the one for run-sim_0.1 contains only a bash script that was generated from the workflow.
Is there any way to figure out what it's doing and why it appears to be stuck?
top shows that the simulation corresponding to run-sim_0.1 is running.
Is there some output buffering that would explain why I don't see any log files?
Unfortunately, I think that's currently expected for the local adapter. It appears to run in a blocking manner, waiting for the subprocess (the step's bash script) to finish before it writes out the .out/.err log files. We do plan to unblock that with an executor backend so that it behaves like the HPC adapters, but that's only in a dev branch at the moment.
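To illustrate the difference, here is a minimal Python sketch of the two behaviors. It is not the local adapter's actual code: the function names run_blocking and run_streaming are hypothetical, and it only assumes the general pattern described above (capture everything, then write the logs after the process exits) versus handing the log files to the child process directly.

```python
import subprocess
import sys
from pathlib import Path


def run_blocking(cmd, out_path, err_path):
    """Capture output in memory and write the logs only after the command exits."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    # communicate() blocks until the child exits; no log file exists before that.
    out, err = proc.communicate()
    Path(out_path).write_bytes(out)
    Path(err_path).write_bytes(err)
    return proc.returncode


def run_streaming(cmd, out_path, err_path):
    """Give the child the log files directly so output reaches disk while it runs."""
    with open(out_path, "wb") as out, open(err_path, "wb") as err:
        proc = subprocess.Popen(cmd, stdout=out, stderr=err)
        return proc.wait()


if __name__ == "__main__":
    # Hypothetical stand-in for a step's bash script.
    demo = [sys.executable, "-c", "print('simulation step output')"]
    run_blocking(demo, "blocking.out", "blocking.err")
    run_streaming(demo, "streaming.out", "streaming.err")
```

With the first pattern, a long-running step leaves its workspace without any .out/.err files until it finishes, which matches what is described in this issue. With the second, the files exist from the start, although the simulation itself may still buffer its output when stdout is a file rather than a terminal, so lines can show up with some delay.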
Thanks for the explanation and quick reply.