Tensors and Dynamic neural networks in Python with strong GPU acceleration

pytorch/.github

This directory contains the workflows and scripts that support our CI infrastructure, which runs on GitHub Actions.

Workflows / Templates

Our current GitHub Actions setup uses templates written in Jinja, located in the .github/templates directory, to generate the workflow files found in the .github/workflows/ directory.

The templates directory also includes a couple of utility templates that hold pieces shared across the other templates.
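The generate-from-template idea can be sketched in a few lines. This is only an illustration: it uses the standard library's string.Template as a stand-in for Jinja so that it runs without extra dependencies, and the template body, the generated- file name prefix, and the parameter values are all hypothetical rather than taken from .github/templates.

```python
from string import Template
from typing import Dict, List

# Hypothetical, heavily simplified workflow template. The real templates are
# Jinja files in .github/templates; string.Template is only a stand-in here.
WORKFLOW_TEMPLATE = Template(
    "name: $build_environment\n"
    "jobs:\n"
    "  test:\n"
    "    runs-on: $test_runner_type\n"
    "    timeout-minutes: $timeout_after\n"
)


def render_workflow(params: Dict[str, object]) -> str:
    """Fill the template with one workflow's parameters."""
    return WORKFLOW_TEMPLATE.substitute(params)


def generate_all(workflows: List[Dict[str, object]]) -> Dict[str, str]:
    """Map an (assumed) output file name to the rendered workflow text."""
    return {
        f"generated-{w['build_environment']}.yml": render_workflow(w)
        for w in workflows
    }


if __name__ == "__main__":
    rendered = generate_all(
        [
            {
                "build_environment": "demo-linux",
                "test_runner_type": "demo-runner",
                "timeout_after": 360,
            }
        ]
    )
    for file_name, text in rendered.items():
        print(f"# {file_name}")
        print(text)
```

One template plus a list of parameter sets yields one workflow file per entry, so a change to the shared template reaches every generated workflow the next time the files are regenerated.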

(Re)Generating workflow files

You will need jinja2 to regenerate the workflow files. It can be installed using:

pip install -r .github/requirements.txt

Workflows can be generated or regenerated using the following command:

.github/regenerate.sh

Adding a new generated workflow

New generated workflows can be added in the .github/scripts/generate_ci_workflows.py script. Use the existing entries in that script as examples, and add your workflow to the group of workflows most relevant to it.

Different parameters can be used to achieve different goals, e.g. running jobs on a cron schedule, running only on trunk, etc.
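When choosing a cron schedule, it helps to check how often an expression actually fires. The helper below is a minimal sketch, not part of the repository: the function names are hypothetical, it only understands `*`, `*/N`, `A-B`, single numbers, and comma lists, and it assumes the day, month, and weekday fields are all `*`.

```python
from typing import List


def expand_cron_field(field: str, low: int, high: int) -> List[int]:
    """Expand one cron field into the sorted list of values it matches.

    Supports '*', '*/N', 'A-B', 'A-B/N', 'A/N', single numbers, and
    comma-separated combinations of those. Names like 'MON' are not handled.
    """
    values = set()
    for part in field.split(","):
        rng, _, step = part.partition("/")
        step_n = int(step) if step else 1
        if rng == "*":
            start, end = low, high
        elif "-" in rng:
            a, b = rng.split("-", 1)
            start, end = int(a), int(b)
        else:
            start = int(rng)
            # 'A/N' means "from A to the maximum, every N"; plain 'A' is just A.
            end = high if step else start
        values.update(range(start, end + 1, step_n))
    return sorted(values)


def runs_per_day(cron: str) -> int:
    """Count daily runs, assuming the last three cron fields are all '*'."""
    minute, hour, *_ = cron.split()
    return len(expand_cron_field(minute, 0, 59)) * len(expand_cron_field(hour, 0, 23))


if __name__ == "__main__":
    print(expand_cron_field("*/4", 0, 23))  # [0, 4, 8, 12, 16, 20]
    print(runs_per_day("0 */4 * * *"))      # 6
```

For instance, the expression `0 */4 * * *` matches minute 0 of every fourth hour, which is six runs per day; `0 */6 * * *` would give four.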

ciflow (specific)

ciflow is how we get non-default workflows to run on specific PRs. Within the generate_ci_workflows.py script you will notice a multitude of LABEL_CIFLOW_<NAME> variables, which correspond to labels on GitHub. Workflows that do not run on LABEL_CIFLOW_DEFAULT can be triggered on PRs by applying one of the labels listed for them in generate_ci_workflows.py.

Example:

    CIWorkflow(
        arch="linux",
        build_environment="periodic-linux-xenial-cuda10.2-py3-gcc7-slow-gradcheck",
        docker_image_base=f"{DOCKER_REGISTRY}/pytorch/pytorch-linux-xenial-cuda10.2-cudnn7-py3-gcc7",
        test_runner_type=LINUX_CUDA_TEST_RUNNER,
        num_test_shards=2,
        distributed_test=False,
        timeout_after=360,
        # Only run this on master every 4 hours (6 times per day) since it does take a while
        is_scheduled="0 */4 * * *",
        ciflow_config=CIFlowConfig(
            labels={LABEL_CIFLOW_LINUX, LABEL_CIFLOW_CUDA, LABEL_CIFLOW_SLOW_GRADCHECK, LABEL_CIFLOW_SLOW, LABEL_CIFLOW_SCHEDULED},
        ),
    ),

This workflow is not triggered by default because its CIFlowConfig does not contain the LABEL_CIFLOW_DEFAULT label, but applying the LABEL_CIFLOW_SLOW_GRADCHECK label to your PR will trigger this specific workflow.
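The triggering rule described above can be modeled as a simple set check. The sketch below is not the actual ciflow implementation: the WorkflowSketch class, the should_run helper, and the label strings are hypothetical, and it assumes that any single matching label on the PR is enough to trigger a workflow.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable

# Illustrative label values; the real ones are defined in generate_ci_workflows.py.
LABEL_CIFLOW_DEFAULT = "ciflow/default"
LABEL_CIFLOW_LINUX = "ciflow/linux"
LABEL_CIFLOW_SLOW_GRADCHECK = "ciflow/slow-gradcheck"


@dataclass(frozen=True)
class WorkflowSketch:
    """Hypothetical stand-in for a CIWorkflow and its CIFlowConfig labels."""
    name: str
    labels: FrozenSet[str]


def should_run(workflow: WorkflowSketch, pr_labels: Iterable[str]) -> bool:
    """Run default workflows always; run others only if the PR carries one of their labels."""
    if LABEL_CIFLOW_DEFAULT in workflow.labels:
        return True
    return bool(workflow.labels & set(pr_labels))


if __name__ == "__main__":
    slow_gradcheck = WorkflowSketch(
        name="periodic-slow-gradcheck",
        labels=frozenset({LABEL_CIFLOW_LINUX, LABEL_CIFLOW_SLOW_GRADCHECK}),
    )
    print(should_run(slow_gradcheck, []))                             # False
    print(should_run(slow_gradcheck, [LABEL_CIFLOW_SLOW_GRADCHECK]))  # True
```

Under this model, a PR with no ciflow labels only gets the workflows tagged with the default label, and adding one label opts the PR into every workflow that lists it.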

ciflow (trunk)

The label ciflow/trunk can be used to run trunk-only workflows on a PR. This is especially useful when re-landing a PR that was reverted for failing a non-default workflow.

Infra

Currently most of our self-hosted runners are hosted on AWS. For a comprehensive list of available runner types, see .github/scale-config.yml.

Self-hosted runners that are not on AWS:

  • ROCm runners

Adding new runner types

New runner types can be added by committing changes to .github/scale-config.yml. Example: pytorch#70474

NOTE: New runner types can only be used once the changes to .github/scale-config.yml have landed on the default branch.