fulcrumgenomics/dagr

I want to know all the inputs and outputs produced by the pipeline, and who made them

nh13 opened this issue · 0 comments

nh13 commented

At the end of the execution, I want a report that has the following columns:

  • task name/id/etc.
  • whether the file is an input or an output
  • path to file
  • whether the file still exists at the end of the pipeline (or was an intermediate that got deleted)

I am not sure how to implement it, but perhaps with some annotations (@input, @output, @delete)? When a task is scheduled, its inputs are added. When a task executes successfully, its outputs are added (and checked). When a task deletes a file, the file is marked as deleted. For the tasks that delete files, we could instead use @input(deleted=true).
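
To make the bookkeeping concrete, here is a rough sketch of the tracking described above, without any annotations. Everything in it is hypothetical: `FileRecord`, `FileReport`, and the `onScheduled`/`onSucceeded`/`onDeleted` hooks are illustrative names, not part of dagr, and I am assuming "checked" means checking that each output exists on disk.

```scala
import java.nio.file.{Files, Path}
import scala.collection.mutable.ArrayBuffer

// Hypothetical types for illustration only; none of these exist in dagr.
sealed trait Role
case object Input extends Role
case object Output extends Role

final case class FileRecord(task: String, role: Role, path: Path, deleted: Boolean = false)

final class FileReport {
  private val records = ArrayBuffer.empty[FileRecord]

  /** Called when a task is scheduled: register its declared inputs. */
  def onScheduled(task: String, inputs: Seq[Path]): Unit =
    inputs.foreach(p => records += FileRecord(task, Input, p))

  /** Called when a task succeeds: register its outputs and return any that are
    * missing on disk (assumption: "checked" means an existence check). */
  def onSucceeded(task: String, outputs: Seq[Path]): Seq[Path] = {
    outputs.foreach(p => records += FileRecord(task, Output, p))
    outputs.filterNot(p => Files.exists(p))
  }

  /** Called when a task deletes a file: flag every record for that path. */
  def onDeleted(path: Path): Unit =
    records.indices.foreach { i =>
      if (records(i).path == path) records(i) = records(i).copy(deleted = true)
    }

  /** One tab-separated row per record: task, role, path, exists at end. */
  def rows: Seq[String] = records.toSeq.map { r =>
    val existsAtEnd = !r.deleted && Files.exists(r.path)
    Seq(r.task, r.role.toString, r.path.toString, existsAtEnd.toString).mkString("\t")
  }
}
```

The annotations would then only be a way to discover which task fields to pass to these hooks; the report itself would be written from `rows` once the pipeline finishes.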