Samsung/qaboard

Pipelines / DAG


Currently QA-Board lacks expressiveness for our common use-case of:

  1. Run on some images
  2. Calibration
  3. Validation
Likewise, we can't easily express pipelines like training-then-evaluation.

We need a way to express series of steps / pipelines / tasks organized as a directed acyclic graph (DAG).

We're looking for feedback or alternative ideas, especially if you have experience with workflow engines such as DVC. Thanks!

Workarounds

Users have worked around this by:

  • wrapping qa batch calls with a scripted pipeline
  • writing a complicated run() function with lots of logic
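The first workaround can be sketched roughly as follows. This is a minimal Python sketch, not code from an actual project: the step names and the subprocess wrapper are illustrative.

```python
# Sketch of the "wrap qa batch in a script" workaround: run each step's
# `qa batch` command in order, aborting on the first failure.
import subprocess

# Illustrative step names, not from a real project.
STEPS = ["my-calibration-images", "my-calibration", "my-evaluation-batch"]

def step_command(batch_name):
    """Build the `qa batch` invocation for one step."""
    return ["qa", "batch", batch_name]

def run_pipeline(steps, runner=subprocess.run):
    """Run steps sequentially; stop at the first failure."""
    for name in steps:
        result = runner(step_command(name))
        if result.returncode != 0:
            raise RuntimeError(f"step {name!r} failed")
```

The `runner` parameter is only there to make the sketch testable without actually invoking `qa`.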

Status

  • Implement user-side support for sequential pipelines
  • Support pipelines officially in QA-Board
  • Support DAGs

Possible API

batch1:
  inputs:
  - A.jpg
  - B.jpg
  configurations:
  - base

batch2:
  needs: batch1
  type: script
  configurations:
  - python my_script.py ${o.output_dir for o in needs["batch1"]}

More complex:

my-calibration-images:
    configurations:
    - base
    inputs:
    - DL50.raw
    - DL55.raw
    - DL65.raw
    - DL75.raw

my-calibration:
    needs:
      calibration_images: my-calibration-images
    type: script
    configurations:
    - python calibration.py ${o.output_dir for o in needs[calibration_images]}

my-evaluation-batch:
    needs:
      calibration: my-calibration
    inputs:
    - test_image_1.raw
    - test_image_2.raw
    - test_image_3.raw
    configurations:
    - base
    - ${needs[calibration].output_dir}/calibration.cde
$ qa batch my-evaluation-batch
#=> qa batch my-calibration-images
#=> qa batch my-calibration
#=> qa batch my-evaluation-batch
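Under this hypothetical API, the expansion shown above amounts to a depth-first walk of the needs edges, running dependencies before dependents (a topological order of the DAG). A minimal sketch, assuming the YAML is already parsed into a dict — none of these names are real QA-Board API:

```python
# The `batches` dict mirrors the hypothetical YAML above.
batches = {
    "my-calibration-images": {"needs": {}},
    "my-calibration": {"needs": {"calibration_images": "my-calibration-images"}},
    "my-evaluation-batch": {"needs": {"calibration": "my-calibration"}},
}

def execution_order(target, batches, seen=None):
    """Return the batches to run, dependencies first, each at most once."""
    if seen is None:
        seen = set()
    order = []
    for dep in batches[target]["needs"].values():
        order += execution_order(dep, batches, seen)
    if target not in seen:
        seen.add(target)
        order.append(target)
    return order
```

The `seen` set makes diamond dependencies run only once, which is also what makes this cache-friendly.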

Thoughts

  • We should add built-in support for a script input type that simply executes its configurations as shell commands. It goes well with DAGs.
my-script:
  needs: batch1
  type: script
  configurations:
  - echo OK

Expected

  • Easy API
  • Cache friendly
  • Can be used in a non-blocking way

Update: thanks to Itamar Persi and Ela Shahar, there is a pipeline implementation in "user-land":

my-pipeline:
  configs:
  - run: echo "Step 1"
  - batch: first-batch
  - batch:
    - second-batch
    - third-batch
    - label: batches running in parallel
  - run: some-postprocessing-script.py
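The user-land code itself isn't published in this issue, but a run() interpreting configs like these would roughly dispatch on the run/batch keys. A hedged sketch, with callback names invented for illustration:

```python
# NOT the actual user-land implementation: a rough dispatcher for pipeline
# configs where `run:` entries are shell commands and `batch:` entries name
# one or more batches to launch together.
def interpret(configs, run_step, batch_step):
    """Dispatch each pipeline step, in order, to the given callbacks."""
    for step in configs:
        if "run" in step:
            run_step(step["run"])
        elif "batch" in step:
            batches = step["batch"]
            if isinstance(batches, str):
                batches = [batches]
            # skip metadata entries such as `label:` inside a batch list
            batch_step([b for b in batches if isinstance(b, str)])
```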

Features include

  • using PIPELINE_OUTPUT_DIR to share data across the pipeline's steps
  • providing run steps with info on the previous batch (what qa batch --list returns)
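For example, a run: step could use PIPELINE_OUTPUT_DIR, the one environment variable the issue names, to leave results where later steps can read them. The helper name and JSON file layout below are assumptions for illustration, not part of QA-Board:

```python
# Illustrative only: write results into the pipeline's shared output
# directory so a later step (e.g. a postprocessing script) can pick them up.
import json
import os
from pathlib import Path

def save_results(results, filename="results.json"):
    """Serialize `results` as JSON under PIPELINE_OUTPUT_DIR."""
    out_dir = Path(os.environ.get("PIPELINE_OUTPUT_DIR", "."))
    path = out_dir / filename
    path.write_text(json.dumps(results))
    return path
```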

It's much simpler than a full DAG, and good enough in most cases.

Next steps

  • We'll contribute it to the project as a default run() if the input type is pipeline
  • Until then, we'll provide the code here on request as sample code (just comment here)...