Logging feature request
lsawade opened this issue · 1 comments
One of the great features of `nnodes` is its ability to log tasks directly while the jobs are running. We didn't know we needed this until we had it. Put simply, it's a command-line tool that lets you keep track of all jobs that are running, done, failed, or waiting to be submitted. The example output for two moment tensor inversions is as follows:
- C090497A
    0) iteration
        0) mpi-create-dir-C090497A (04:50)
        1) forward_frechet (15:47)
        2) processing-all
            - C090497A_process_data (running - 01:29)
            - process_synthetics
                - C090497A_process_synt (running - 01:29)
                - C090497A_process_dsdm00000 (running - 01:29)
                - C090497A_process_dsdm00001 (running - 01:29)
                - C090497A_process_dsdm00002 (running - 01:29)
                - C090497A_process_dsdm00003 (running - 01:29)
                - C090497A_process_dsdm00004 (running - 01:29)
                - C090497A_process_dsdm00005 (running - 01:29)
                - C090497A_process_dsdm00006 (running - 01:29)
                - C090497A_process_dsdm00007 (running - 01:29)
                - C090497A_process_dsdm00008 (running - 01:29)
                - C090497A_process_dsdm00009 (running - 01:29)
        3) mpiexec_window
        4) compute_weights
        5) compute_cgh
        6) compute_descent
        7) compute_optvals
        8) linesearch
        9) iteration_check
- B092894B
    0) iteration
        0) mpi-create-dir-B092894B (04:51)
        1) forward_frechet
            - forward (14:47)
            - frechet
                - mpiexec_xspecfem3D (running - 15:44)
                - mpiexec_xspecfem3D (14:49)
                - mpiexec_xspecfem3D (14:48)
                - mpiexec_xspecfem3D (14:49)
                - mpiexec_xspecfem3D (running - 14:52)
                - mpiexec_xspecfem3D (running - 14:51)
                - mpiexec_xspecfem3D (14:49)
                - mpiexec_xspecfem3D (14:47)
                - mpiexec_xspecfem3D (running - 15:44)
        2) processing-all
        3) mpiexec_window
        4) compute_weights
        5) compute_cgh
        6) compute_descent
        7) compute_optvals
        8) linesearch
        9) iteration_check
where

- `-` indicates tasks that can be run concurrently
- `1,2,3,..` indicate sequentially run tasks
- tasks that are
    - done have a `(hh:mm:ss)` stamp
    - running have a `(running - hh:mm:ss)` stamp
    - waiting to be executed have nothing
    - a parent to running tasks have nothing
In the backend, `nnodes` uses a `dict` to keep track of submission times, execution times, etc., with start and end time attributes for each task. The log is produced by simply reading these attributes from the dictionary and printing them.
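To make the idea concrete, here is a minimal sketch of how a stamp could be derived from per-task start and end times. This is not the actual `nnodes` code: the key names `starttime` and `endtime`, the helper names, and the use of plain seconds are all assumptions for illustration.

```python
from typing import Optional


def format_elapsed(seconds: float) -> str:
    """Format a duration given in seconds as hh:mm:ss."""
    total = int(seconds)
    hours, rest = divmod(total, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def task_stamp(task: dict, now: float) -> str:
    """Return the stamp for one task entry (hypothetical layout).

    - no start time        -> waiting, empty stamp
    - start but no end     -> running, elapsed time so far
    - both start and end   -> done, total run time
    """
    start: Optional[float] = task.get("starttime")
    end: Optional[float] = task.get("endtime")
    if start is None:
        return ""
    if end is None:
        return f"(running - {format_elapsed(now - start)})"
    return f"({format_elapsed(end - start)})"


# Hypothetical task dictionary; times are seconds since workflow start.
tasks = {
    "forward": {"starttime": 0.0, "endtime": 887.0},
    "frechet": {"starttime": 887.0, "endtime": None},
    "linesearch": {"starttime": None, "endtime": None},
}

for name, task in tasks.items():
    print(f"- {name} {task_stamp(task, now=1831.0)}".rstrip())
```

Because the stamp is computed only from stored attributes, the log can be regenerated at any time without touching the running jobs.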
What I imagine is something quite similar that could be called like:

`radical-log-workflow <session.id>`

and output a log with

- `-` for pipelines
- `1,2,3,...` for stages
- `-` again for tasks

and similar timestamps.
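A rough sketch of what the rendering step could look like is below. The `Pipeline`, `Stage`, and `Task` dataclasses here are hypothetical stand-ins defined only for this example, not the RADICAL API, and the stamps are assumed to be precomputed strings.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Task:
    name: str
    stamp: str = ""  # e.g. "(00:14:47)" or "(running - 00:15:44)"


@dataclass
class Stage:
    name: str
    tasks: List[Task] = field(default_factory=list)


@dataclass
class Pipeline:
    name: str
    stages: List[Stage] = field(default_factory=list)


def render(pipelines: List[Pipeline]) -> str:
    """Render pipelines with '-', stages with 'N)', and tasks with '-'."""
    lines = []
    for pipeline in pipelines:
        lines.append(f"- {pipeline.name}")
        for index, stage in enumerate(pipeline.stages):
            lines.append(f"  {index}) {stage.name}")
            for task in stage.tasks:
                # rstrip drops the trailing space when a task has no stamp
                lines.append(f"    - {task.name} {task.stamp}".rstrip())
    return "\n".join(lines)


# Made-up example data, loosely modeled on the output above.
workflow = [
    Pipeline("C090497A", [
        Stage("forward_frechet", [
            Task("forward", "(00:14:47)"),
            Task("frechet", "(running - 00:15:44)"),
        ]),
        Stage("processing-all", [Task("process_data")]),
    ]),
]

print(render(workflow))
```

How the session data would actually be loaded from `<session.id>` is left open here, since that depends on what the session stores.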
Yes, that is indeed neat - accepted as feature request.