heliumdatacommons/data-commons-workspace

Expose workflow logs through iRODS user's home directory

Closed this issue · 2 comments

Right now the jupyter container redirects output from background processes to a log in /var/log/datacommons

When running workflows in the container from pivot, the only practical way to see workflow progress is to look at the Docker log on the worker node.

The output from the toil execution should be tee'd to a log file in the same log directory as jupyter.

Then in the base image, add an httpd entry to map to this directory.
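One way the httpd entry could look (a sketch only; the alias path and directory permissions are assumptions, not a final config):

```apache
# Hypothetical Apache config exposing the log directory read-only over HTTP
Alias "/logs" "/var/log/datacommons"
<Directory "/var/log/datacommons">
    Options +Indexes
    Require all granted
</Directory>
```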

Updating this based on discussion with @stevencox, as the priority of this issue has been increased. We can accept a workflow execution GUID from the launcher environment and, if present, send all workflow logging to the user's home directory (which should be the directory the iRODS client mounts by default in the container) under `.log/<workflowname>/<guid>/{stdout.log,stderr.log}`. This would let any client application connected to the same iRODS server as that user tail the log files for the workflow it launched.

We should be able to do this by adding a tee redirect (with `-a`, to append) to each 'automated' command in the start scripts. We probably do not want to log output from interactive shell sessions.
https://github.com/theferrit32/data-commons-workspace/blob/master/docker/datacommons-base/base-start.sh
https://github.com/theferrit32/data-commons-workspace/blob/master/docker/datacommons-jupyter/jupyter-start.sh
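A minimal sketch of what the wrapper in the start scripts could look like. The variable names (`WORKFLOW_GUID`, `IRODS_HOME`, `WORKFLOW_NAME`) and the function name are assumptions for illustration, not part of the existing scripts:

```shell
#!/usr/bin/env bash
# Hypothetical logging wrapper for 'automated' commands in the start scripts.
# If the launcher passed a workflow GUID, tee the command's output into the
# iRODS-mounted home directory; otherwise run the command unlogged.

run_logged() {
  if [ -n "${WORKFLOW_GUID:-}" ]; then
    logdir="${IRODS_HOME:-$HOME}/.log/${WORKFLOW_NAME:-toil}/$WORKFLOW_GUID"
    mkdir -p "$logdir"
    # tee -a appends, so repeated commands accumulate in the same log files;
    # stdout stays visible in the docker log while also landing in stdout.log
    { "$@" 2>>"$logdir/stderr.log"; } | tee -a "$logdir/stdout.log"
  else
    # interactive / non-workflow invocations are left unlogged
    "$@"
  fi
}

# Example: with no WORKFLOW_GUID in the environment, this just runs normally
run_logged echo "no guid, not logged"
```

Because `tee -a` appends, re-running a step in the same workflow execution keeps accumulating into the same `stdout.log`/`stderr.log` pair under the GUID directory.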

We could also combine stdout and stderr for easier monitoring by the user-facing application (e.g. CommonsShare).
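As a sketch of that combined-stream variant (variable and function names are hypothetical), redirecting stderr into stdout with `2>&1` before the tee interleaves both streams in a single log file:

```shell
#!/usr/bin/env bash
# Hypothetical combined-stream wrapper: 2>&1 merges stderr into stdout, so a
# single tee captures both streams, interleaved, in one combined log file.

run_logged_combined() {
  logdir="${IRODS_HOME:-$HOME}/.log/${WORKFLOW_NAME:-toil}/${WORKFLOW_GUID:-unknown}"
  mkdir -p "$logdir"
  "$@" 2>&1 | tee -a "$logdir/combined.log"
}
```

The user-facing application then only has to tail one file per workflow execution instead of two.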

Finished this. Persistent workflow logs are accessible in the iRODS user's `<zone>/home/<user>/.log`, or inside the containers under `/renci/irods/home/<user>/.log`.

They're named with the type of task that generated them (e.g. `_toil_worker`, `_toil_exec`) with a timestamp appended.