This directory contains workflows and scripts to support our CI infrastructure that runs on GitHub Actions.
Our current GitHub Actions setup uses templates written in Jinja, located in the `.github/templates` directory, to generate the workflow files found in the `.github/workflows/` directory.
The templates directory also includes a couple of utility templates that hold common pieces shared among the other templates.
You will need `jinja2` to regenerate the workflow files. It can be installed using:
pip install -r .github/requirements.txt
Workflows can be generated / regenerated using the following command:
.github/regenerate.sh
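
To illustrate the idea of generating several workflow files from one template, here is a minimal sketch. It uses Python's standard-library `string.Template` as a stand-in for Jinja so that it runs without extra dependencies. The template text, the `render_workflows` helper, and the build and runner names are all hypothetical and do not reflect the actual contents of `.github/templates`.

```python
from string import Template

# Hypothetical, heavily simplified workflow template. The real templates are
# written in Jinja and are considerably more involved.
WORKFLOW_TEMPLATE = Template(
    "name: $build_environment\n"
    "jobs:\n"
    "  build:\n"
    "    runs-on: $runner\n"
)


def render_workflows(configs):
    """Render one workflow body per config, keyed by an output file name."""
    rendered = {}
    for config in configs:
        file_name = f"generated-{config['build_environment']}.yml"
        rendered[file_name] = WORKFLOW_TEMPLATE.substitute(config)
    return rendered


# Hypothetical parameter sets; each one yields a separate workflow file.
workflows = render_workflows(
    [
        {"build_environment": "example-linux-build", "runner": "example.linux.runner"},
        {"build_environment": "example-cuda-build", "runner": "example.gpu.runner"},
    ]
)

for file_name, body in workflows.items():
    print(f"--- {file_name}")
    print(body)
```

The point of the sketch is that the generated files are derived output: changes belong in the template or the parameter sets, followed by a regeneration, rather than in the generated files themselves.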
New generated workflows can be added in the `.github/scripts/generate_ci_workflows.py` script. You can use the examples in that script as a reference when adding a workflow to the stream that is most relevant to you.
Different parameters can be used to achieve different goals, e.g. running jobs on a cron schedule or running only on trunk.
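
As an illustration of the cron parameter, the sketch below expands the hour field of a five-field cron expression to show when a scheduled job would fire. The `expand_hours` helper is hypothetical and handles only `*`, `*/N`, and comma-separated values in the hour field; it is not a general cron parser.

```python
def expand_hours(cron_expression):
    """Return the hours (0-23) at which a five-field cron expression fires.

    Only `*`, `*/N`, and comma-separated integers are supported in the hour field.
    """
    fields = cron_expression.split()
    if len(fields) != 5:
        raise ValueError("expected five fields: minute hour day month weekday")
    hour_field = fields[1]
    if hour_field == "*":
        return list(range(24))
    if hour_field.startswith("*/"):
        step = int(hour_field[2:])
        return list(range(0, 24, step))
    return sorted(int(value) for value in hour_field.split(","))


hours = expand_hours("0 */4 * * *")
print(hours)       # [0, 4, 8, 12, 16, 20]
print(len(hours))  # 6
```

Note that GitHub Actions evaluates scheduled workflows in UTC, so the hours above are UTC hours.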
`ciflow` is the way we can get non-default workflows to run on specific PRs. Within the `generate_ci_workflows.py` script you will notice a multitude of `LABEL_CIFLOW_<NAME>` variables, which correspond to labels on GitHub. Workflows that do not run on `LABEL_CIFLOW_DEFAULT` can be triggered on PRs by applying the label found in `generate_ci_workflows.py`.
Example:
    CIWorkflow(
        arch="linux",
        build_environment="periodic-linux-xenial-cuda10.2-py3-gcc7-slow-gradcheck",
        docker_image_base=f"{DOCKER_REGISTRY}/pytorch/pytorch-linux-xenial-cuda10.2-cudnn7-py3-gcc7",
        test_runner_type=LINUX_CUDA_TEST_RUNNER,
        num_test_shards=2,
        distributed_test=False,
        timeout_after=360,
        # Only run this on master every 4 hours since it does take a while
        is_scheduled="0 */4 * * *",
        ciflow_config=CIFlowConfig(
            labels={LABEL_CIFLOW_LINUX, LABEL_CIFLOW_CUDA, LABEL_CIFLOW_SLOW_GRADCHECK, LABEL_CIFLOW_SLOW, LABEL_CIFLOW_SCHEDULED},
        ),
    ),
This workflow does not get triggered by default since it does not contain the `LABEL_CIFLOW_DEFAULT` label in its `CIFlowConfig`, but applying the `LABEL_CIFLOW_SLOW_GRADCHECK` label on your PR will trigger this specific workflow to run.
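
The triggering rule described above can be sketched as follows. This is a simplified model, not the actual ciflow implementation: the `should_run` function is hypothetical, and the label strings are assumed values chosen for illustration (the real ones are defined in `generate_ci_workflows.py`).

```python
# Assumed label values; the real ones are defined in generate_ci_workflows.py.
LABEL_CIFLOW_DEFAULT = "ciflow/default"
LABEL_CIFLOW_SLOW_GRADCHECK = "ciflow/slow-gradcheck"
LABEL_CIFLOW_CUDA = "ciflow/cuda"


def should_run(workflow_labels, pr_labels):
    """Decide whether a workflow runs on a PR under a simplified ciflow model."""
    # Workflows carrying the default label run on every PR.
    if LABEL_CIFLOW_DEFAULT in workflow_labels:
        return True
    # Otherwise the PR must carry at least one of the workflow's labels.
    return bool(set(workflow_labels) & set(pr_labels))


slow_gradcheck_workflow = {LABEL_CIFLOW_CUDA, LABEL_CIFLOW_SLOW_GRADCHECK}

print(should_run(slow_gradcheck_workflow, set()))                          # False
print(should_run(slow_gradcheck_workflow, {LABEL_CIFLOW_SLOW_GRADCHECK}))  # True
```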
The label `ciflow/trunk` can be used to run `trunk`-only workflows. This is especially useful when trying to re-land a PR that was reverted for failing a non-default workflow.
Currently most of our self-hosted runners are hosted on AWS. For a comprehensive list of available runner types, you can reference `.github/scale-config.yml`.
Exceptions to AWS for self-hosted runners:
- ROCm runners
New runner types can be added by committing changes to `.github/scale-config.yml`. Example: pytorch#70474
NOTE: New runner types can only be used once the changes to `.github/scale-config.yml` have made their way into the default branch.