Quick introduction to using GitHub Actions for python code. This readme walks through some background, and the included `##-ci*.yml` files show workflows of increasing complexity. To use these for your own project, copy them into the `.github/workflows/` folder (creating it if it doesn't exist) and modify as needed.
GitHub Actions is one way of performing continuous integration (CI), which runs automated tests regularly ("continuous") before making any changes to the main codebase ("integration"). There are other platforms for doing so (Travis CI, Circle CI, etc.), but GitHub Actions has the benefit of being built into GitHub. They all work fairly similarly: you define a bunch of steps to run and when you want them to run, generally in a `.yml` file (see here for more details). The platforms described above provide servers to run the tests on (github-hosted runners, as GitHub calls them), which are free for open source projects (though they have limited resources and never include GPUs). Many of these services can also be self-hosted and thus run on your own server (e.g., Flatiron uses Jenkins, which allows us to use Flatiron clusters for GPUs).
CI commonly includes tests and linters to make sure the code runs and meets a desired code style.
For software development, CI is often used to make sure that any new changes don't break old code and to enforce a uniformity of style, which makes understanding the code easier. It is thus incredibly useful (I would argue, necessary).
For research software, it is not as obviously necessary. However, I've used CI for research code to ensure that my code is installable and some basic functionality runs. This means I know someone else will be able to use it (e.g., that I included all necessary dependencies) and means I'm aware when a change in one of my dependencies (or my own code!) breaks something basic.
It can also be used in more complex cases. The Journal of Open Source Software uses it to build papers from markdown, using the GitHub issues framework to centralize the review process. Here's fRAT, a package for analyzing functional MRI data that I reviewed.
I've heard of people using it to generate all the figures for their scientific papers, but my analyses have always required too much time and memory to be feasible for this.
Okay, all that aside, how does one do this? GitHub's documentation on its actions is quite thorough, and a bit intimidating. As CI is generally used by software engineers, most of what you'll find is targeted at them rather than scientists, and can be frustrating to approach. Here, we'll provide some basic examples that are hopefully useful.
Let's look at the first workflow, `00-ci.yml`. This simple workflow runs every time we push to the repository and checks that we're installable with conda (requires an `environment.yml` file) and pip (requires `setup.py` or something similar).
Several things are going on here:
- We specify the name of the workflow and how often the CI runs at the top.
- We have two separate jobs, which will run in parallel. `yml` files are hierarchical, based on indent level: `check_pip_install` and `check_conda_install` are both one level under `jobs` and thus are independent jobs.
- Each job specifies the operating system it runs on (in this case, `ubuntu-latest`) and then the steps required.
- The steps are enumerated using `-` and run in sequence! If one of them fails, the whole job will fail and quit out.
- The three steps here (`checkout`, `setup-python` / `setup-conda`, and `Install dependencies` / `Create environment`) will be required for every job you create: `checkout` checks out your repo from GitHub, the second sets up either python or conda (which includes python) to run your code, and the third actually installs it. Note this mirrors what your users will do! (A minimal sketch of such a workflow follows this list.)
- Note that the setup steps include some configuration, including, critically, the python version. We'll return to that later.
- Many of the arguments here are keywords. See the GitHub Actions docs and the docs for specific actions (e.g., `setup-python`) to understand them.
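To make this concrete, here's a minimal sketch of what a workflow like `00-ci.yml` might look like. The action versions and the choice of `conda-incubator/setup-miniconda` for the conda setup are assumptions; check the included file for the real thing:

```yaml
name: ci
on:
  push:

jobs:
  check_pip_install:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Install Python 3
        uses: actions/setup-python@v4
        with:
          python-version: 3.9
      - name: Install dependencies
        run: pip install .

  check_conda_install:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Create environment
        # assuming conda-incubator/setup-miniconda here, which can build an
        # environment directly from an environment.yml file
        uses: conda-incubator/setup-miniconda@v2
        with:
          environment-file: environment.yml
```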
In `00-ci.yml`, we ran our simple tests with python 3.9 on `ubuntu-latest`. What if you want to test more OSs and python versions? You do that using the build matrix, as demonstrated in `01-ci-build-matrix.yml`.
- We've removed the `check_conda_install` job so we can just focus on the build matrix here, but you can do the same thing for that job, if you'd like.
- If we look at the diff between the `check_pip_install` job in these two files (`diff 00-ci.yml 01-ci-build-matrix.yml`), we see the following:

```diff
21c7,11
-     runs-on: ubuntu-latest
---
+     runs-on: ${{ matrix.os }}
+     strategy:
+       matrix:
+         os: [ubuntu-latest, macos-latest, windows-latest]
+         python-version: [3.8, 3.9, '3.10']
27c17
-           python-version: 3.9
---
+           python-version: ${{ matrix.python-version }}
```

- Instead of specifying `runs-on` and `python-version` directly, we're using some strange curly-bracket syntax. These are contexts, which we're accessing using GitHub Actions' expressions syntax. For our purposes, the lists under `matrix` give us a for loop over those values, which we access using `${{ matrix.os }}` / `${{ matrix.python-version }}`.
- GitHub Actions will generate all combinations here, so running this action will give us nine jobs which all run in parallel (see the sketch below).
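Putting that diff in context, the matrixed job might look something like the following sketch (simplified relative to the included file):

```yaml
jobs:
  check_pip_install:
    runs-on: ${{ matrix.os }}
    strategy:
      matrix:
        # 3 OSs x 3 python versions = 9 parallel jobs
        os: [ubuntu-latest, macos-latest, windows-latest]
        # '3.10' is quoted so YAML doesn't parse it as the float 3.1
        python-version: [3.8, 3.9, '3.10']
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v4
        with:
          python-version: ${{ matrix.python-version }}
      - name: Install dependencies
        run: pip install .
```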
So far, we've been running all our tests every time we push, which is probably way too often. Let's change that with `02-ci-freq.yml`, which will build off of `01-ci-build-matrix.yml`.
Running `diff 01-ci-build-matrix.yml 02-ci-freq.yml` highlights the following differences at the top of the file:

```diff
3c3,9
-   push:
---
+   pull_request:
+     branches:
+       - main
+       - development
+   schedule:
+     - cron: 0 0 * * 0 # weekly
+   workflow_dispatch:
```
Now, instead of running every time we push, we have three different triggers:

- `pull_request: branches:`: every time we make a pull request against one of the specified branches (here, `main` and `development`), this workflow will run. This is useful when using git flow for software development, to ensure all merged changes pass all tests.
- `schedule:`: this specifies that we should run the workflow weekly (it follows the syntax of cron, a standard unix utility for scheduling jobs; it can be cryptic, but thankfully there are tools to help you generate these expressions, and see the annotated example after this list). This is useful for ensuring that a change in one of your dependencies doesn't randomly break your code.
- `workflow_dispatch:`: this allows us to manually trigger the workflow, which can be handy.

These are the options I often use. See the GitHub docs for more info on the possible options here.
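To demystify the cron string a bit, here's one way to read the schedule used above (note that GitHub Actions evaluates these schedules in UTC):

```yaml
on:
  schedule:
    # cron fields are: minute, hour, day-of-month, month, day-of-week
    # 0 0 * * 0 means: minute 0, hour 0 (midnight UTC), any day of the
    # month, any month, day-of-week 0 (Sunday), i.e., weekly
    - cron: 0 0 * * 0
```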
So far, our test is super basic. What else can we do with CI? Basically, whatever you want (with open source code). Let's look at `03-ci-tests.yml`, which builds off of `01-ci-build-matrix.yml`.
Running `diff 01-ci-build-matrix.yml 03-ci-tests.yml` highlights the following at the end of our code:

```diff
6c6,30
-   check_pip_install:
---
+   lint:
+     runs-on: ubuntu-latest
+     steps:
+       - uses: actions/checkout@v3
+       - name: Install Python 3
+         uses: actions/setup-python@v4
+         with:
+           python-version: 3.9
+           cache: pip
+           cache-dependency-path: setup.py
+       - name: Install dependencies
+         run: |
+           # using the --upgrade and --upgrade-strategy eager flags ensures that
+           # pip will always install the latest allowed version of all
+           # dependencies, to make sure the cache doesn't go stale
+           pip install --upgrade --upgrade-strategy eager .
+           pip install black isort flake8
+       - name: Lint
+         run: |
+           black --check my_package/
+           isort --check my_package/
+           flake8 my_package/ --max-complexity 10
+
+   tests:
+     needs: lint
25a50,52
+           pip install pytest
+       - name: Run pytest
+         run: pytest tests/
```
Here, we've added a new job, `lint`, and changed `check_pip_install` to `tests`.
- If you look at `lint` carefully, you can see that it's basically the same as our previous jobs except:
  - we've gone back to a single OS and python version (because the output of the linters shouldn't depend on those);
  - we have a new `pip install` line, which installs the linters: `black`, `isort`, and `flake8`;
  - we added a new step, `Lint`, which runs all three on a (non-existent) directory named `my_package/`.
- By passing the `--check` flag, any of these linters will cause the job to fail if our code doesn't meet their standards. This way we know our code looks like what we expect!
- Similarly, our `tests` job is exactly the same as `check_pip_install` from before (including the use of the build matrix for OS and python) except:
  - we install pytest, a nice testing framework for python;
  - we add a step, `Run pytest`, which runs `pytest` on a (non-existent) directory named `tests/`;
  - we added a `needs: lint` line. This means that the two jobs will not run in parallel; rather, `tests` will wait for `lint` to succeed before running, so we won't run tests if the linters don't pass.
- Similar to the linters, if any of the tests fail, the whole job will fail.
For both of these, we're just taking some arbitrary code that we can (and should!) run locally and making sure it gets run every time the workflow gets triggered. Automating them in this way ensures all these checks are run whenever necessary, reducing cognitive load on the developers.
But we could include any code here! See the `test_snakefile` and `run_Freeman_check_notebook` jobs from my foveated-metamers github repo (research code supporting a publication) for more complex examples. In the first job, I run a basic snakemake command to make sure that my automated installation and setup steps allow users to run snakemake, a workflow manager I use in the project. In the second, I set up the environment and run a notebook using `jupyter execute` to make sure anyone who downloads my code can do the same.
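As a rough sketch of the notebook idea (the notebook path and dependencies here are hypothetical, and the actual job in foveated-metamers is more involved):

```yaml
  run_notebook:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v4
        with:
          python-version: 3.9
      - name: Install dependencies
        # nbclient provides the `jupyter execute` command
        run: pip install . nbclient
      - name: Run notebook
        # executes the notebook top-to-bottom; any raised exception fails the job
        run: jupyter execute notebooks/example.ipynb
```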
It doesn't have to call external code either; you can simply include several bash lines, as I do in the `no_extra_nblinks` job from plenoptic. Here, I'm just counting the number of `nblink` files that live under the `docs/tutorials/` directory and ensuring that it's the same as the number of `ipynb` files that live under the `examples/` directory (which is useful because of how I build my docs). Any code that checks something and raises an error status (e.g., `exit 1` in bash or `raise Exception` in python) if your assumptions aren't met can be useful!
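A sketch of what such a check might look like (the real job in plenoptic may be written differently):

```yaml
  no_extra_nblinks:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Check that nblink and ipynb counts match
        shell: bash
        run: |
          n_nblink=$(find docs/tutorials/ -name '*.nblink' | wc -l)
          n_ipynb=$(find examples/ -name '*.ipynb' | wc -l)
          # raise an error status if the two counts differ
          if [[ "$n_nblink" -ne "$n_ipynb" ]]; then
            echo "Found $n_nblink nblink files but $n_ipynb notebooks!"
            exit 1
          fi
```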
When developing a software package that is used by anyone other than yourself, you generally don't want to accidentally break something. Writing tests to check that things work is a great first step, and CI makes sure that they run regularly, so you don't have to remember to do so. But so far, nothing is forcing you to make sure your tests pass, so you could (when tired or stressed or annoyed) ignore them completely.
To avoid this situation, you can use branch protection rules to ensure that all required status checks pass before merging any changes into a given branch (such as `main` or `development`).
For our purposes, this means making sure that all the workflows pass. You can do this by manually adding individual jobs as required checks, but there can be many of them (especially if you use build matrices), and you'll need to add/remove them as you change the tests you run. Simpler is to use the `alls-green` action, as demonstrated in `04-ci-check.yml`.
Running `diff 03-ci-tests.yml 04-ci-check.yml` shows us the following:

```diff
30d29
-     needs: lint
52a52,63
+
+   check:
+     if: always()
+     needs:
+       - lint
+       - tests
+     runs-on: ubuntu-latest
+     steps:
+       - name: Decide whether all tests and notebooks succeeded
+         uses: re-actors/alls-green@afee1c1eac2a506084c274e9c02c8e0687b48d9e # v1.2.2
+         with:
+           jobs: ${{ toJSON(needs) }}
```
- We've removed the dependency between `lint` and `tests`. Now, the two jobs will run independently and in parallel.
- We've added a new job, `check`, which will only pass if all of the jobs listed under `needs` (including the many versions spawned by the build matrix) have passed.
- We can then say that this single status check is required before merging, since it will only pass if everything else passes (we just need to remember to update it as we add new jobs!).
Ran out of time, but see plenoptic for an example of how to automatically deploy with an action, as described here. This is based on the python packaging guide. I recommend you first deploy to the Test PyPI server and ensure your package is installable from there, but I haven't set that up yet.
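For reference, a minimal deployment workflow in that style might look something like the following sketch. This assumes a PyPI API token stored as a repository secret named `PYPI_API_KEY` (a hypothetical name), and plenoptic's actual workflow may differ:

```yaml
name: deploy
on:
  release:
    types: [published]

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v4
        with:
          python-version: 3.9
      - name: Build package
        run: |
          pip install build
          python -m build
      - name: Publish to PyPI
        uses: pypa/gh-action-pypi-publish@release/v1
        with:
          password: ${{ secrets.PYPI_API_KEY }}
```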