/pytest-workflow

Configure workflow/pipeline tests using yaml files.

Primary LanguagePythonGNU Affero General Public License v3.0AGPL-3.0

pytest-workflow

More information on how to cite pytest-workflow here.

pytest-workflow is a workflow-system agnostic testing framework that aims to make pipeline/workflow testing easy by using YAML files for the test configuration. Whether you write your pipelines in WDL, snakemake, nextflow, bash or any other workflow framework, pytest-workflow makes testing easy. pytest-workflow is build on top of the pytest test framework.

For our complete documentation and examples checkout our readthedocs page.

Installation

Pytest-workflow requires Python 3.7 or higher. It is tested on Python 3.7, 3.8, 3.9, 3.10 and 3.11.

  • Make sure your virtual environment is activated.
  • Install using pip pip install pytest-workflow
  • Create a tests directory in the root of your repository.
  • Create your test yaml files in the tests directory.

Pytest-workflow is also available as a conda package on conda-forge. Follow these instructions to set up channels properly in order to use conda-forge. Alternatively, you can set up the channels correctly for use with bioconda. After that conda install pytest-workflow can be used to install pytest-workflow.

Quickstart

Run pytest from an environment with pytest-workflow installed. Pytest will automatically gather files in the tests directory starting with test and ending in .yaml or .yml.

To check the progress of a workflow while it is running you can use tail -f on the stdout or stderr file of the workflow. The locations of these files are reported in the log as soon as a workflow is started.

For debugging pipelines using the --kwd or --keep-workflow-wd flag is recommended. This will keep the workflow directory and logs after the test run so it is possible to check where the pipeline crashed. The -v flag can come in handy as well as it gives a complete overview of succeeded and failed tests.

Below is an example of a YAML file that defines a test:

- name: Touch a file
  command: touch test.file
  files:
    - path: test.file

This will run touch test.file and check afterwards if a file with path: test.file is present. It will also check if the command has exited with exit code 0, which is the only default test that is run. Testing workflows that exit with another exit code is also possible. Several other predefined tests as well as custom tests are possible.

- name: moo file                     # The name of the workflow (required)
  command: bash moo_workflow.sh      # The command to execute the workflow (required)
  files:                             # A list of files to check (optional)
    - path: "moo.txt"                # File path. (Required for each file)
      contains:                      # A list of strings that should be in the file (optional)
        - "moo"
      must_not_contain:              # A list of strings that should NOT be in the file (optional)
        - "Cock a doodle doo"
      md5sum: e583af1f8b00b53cda87ae9ead880224   # Md5sum of the file (optional)
      encoding: UTF-8                # Encoding for the text file (optional). Defaults to system locale.

- name: simple echo                  # A second workflow. Notice the starting `-` which means
  command: "echo moo"                # that workflow items are in a list. You can add as much workflows as you want
  files:
    - path: "moo.txt"
      should_exist: false            # Whether a file should be there or not. (optional, if not given defaults to true)
  stdout:                            # Options for testing stdout (optional)
    contains:                        # List of strings which should be in stdout (optional)
      - "moo"
    must_not_contain:                # List of strings that should NOT be in stout (optional)
      - "Cock a doodle doo"
    encoding: ASCII                  # Encoding for stdout (optional). Defaults to system locale.

- name: mission impossible           # Also failing workflows can be tested
  tags:                              # A list of tags that can be used to select which test
    - should fail                    # is run with pytest using the `--tag` flag.
  command: bash impossible.sh
  exit_code: 2                       # What the exit code should be (optional, if not given defaults to 0)
  files:
    - path: "fail.log"               # Multiple files can be tested for each workflow
    - path: "TomCruise.txt.gz"       # Gzipped files can also be searched, provided their extension is '.gz'
      contains:
        - "starring"
      extract_md5sum: e27c52f6b5f8152aa3ef58be7bdacc4d   # Md5sum of the uncompressed file (optional)
  stderr:                            # Options for testing stderr (optional)
    contains:                        # A list of strings which should be in stderr (optional)
      - "BSOD error, please contact the IT crowd"
    must_not_contain:                # A list of strings which should NOT be in stderr (optional)
      - "Mission accomplished!"
    encoding: UTF-16                 # Encoding for stderr (optional). Defaults to system locale.

- name: regex tests
  command: echo Hello, world
  stdout:
    contains_regex:                  # A list of regex patterns that should be in stdout (optional)
      - 'Hello.*'                    # Note the single quotes, these are required for complex regexes
      - 'Hello .*'                   # This will fail, since there is a comma after Hello, not a space

    must_not_contain_regex:          # A list of regex patterns that should not be in stdout (optional)
      - '^He.*'                      # This will fail, since the regex matches Hello, world
      - '^Hello .*'                  # Complex regexes will break yaml if double quotes are used

For more information on how Python parses regular expressions, see the Python documentation.

Documentation for more advanced use cases including the custom tests can be found on our readthedocs page.