Marge-bot
Marge-bot is a merge-bot for GitLab that, beside other goodies, implements the Not Rocket Science Rule Of Software Engineering:
automatically maintain a repository of code that always passes all the tests.
— Graydon Hoare, main author of Rust
This simple rule of thumb is still nowadays surprisingly difficult to implement with the state-of-the-art tools, and more so in a way that scales with team size (also see our blog post).
Take, for instance, GitHub's well-known pull-request workflow. Here, CI needs to pass on the branch before the pull request can be accepted but after that, the branch is immediately merged (or rebased) into master. By the time this happens, enough changes may have occurred to induce test breakage, but this is only to be found out when the commits have already landed.
GitLab (in their enterprise edition), offers an important improvement here with their semi-linear history and fast-forward merge request methods: in both cases a merge request can only be accepted if the resulting master branch will be effectively the same as the merge request branch on which CI has passed. If master has changed since the tests were last ran, it is the user's responsibility to rebase the changes and retry. But this just doesn't scale: if you have, a mono-repo, a large team working on short-lived branches, a CI pipeline that takes 5-10 minutes to complete... then the number of times one need's to rebase-and-try-to-accept starts to become unbearable.
Marge-bot offers the simplest of workflows: when a merge-request is ready, just assign it to its user, and let her do all the rebase-wait-retry for you. If anything goes wrong (merge conflicts, tests that fail, etc.) she'll leave a message on the merge-request, so you'll get notified. Marge-bot can handle an adversarial environment where some developers prefer to merge their own changes, so the barrier for adoption is really low.
Since she is at it, she can optionally provide some other goodies like tagging
of commits (e.g. Reviewed-by: ...
) or preventing merges during certain hours.
Configuring
Args that start with '--' (eg. --auth-token) can also be set in a config file (specified via --config-file). The config file uses YAML syntax and must represent a YAML 'mapping' (for details, see http://learn.getgrav.org/advanced/yaml). If an arg is specified in more than one place, then commandline values override environment variables which override config file values which override defaults.
optional arguments:
-h, --help show this help message and exit
--config-file CONFIG_FILE
config file path [env var: MARGE_CONFIG_FILE] (default: None)
--auth-token TOKEN Your GitLab token.
DISABLED because passing credentials on the command line is insecure:
You can still set it via ENV variable or config file, or use "--auth-token-file" flag.
[env var: MARGE_AUTH_TOKEN] (default: None)
--auth-token-file FILE
Path to your GitLab token file.
[env var: MARGE_AUTH_TOKEN_FILE] (default: None)
--gitlab-url URL Your GitLab instance, e.g. "https://gitlab.example.com".
[env var: MARGE_GITLAB_URL] (default: None)
--ssh-key KEY The private ssh key for marge so it can clone/push.
DISABLED because passing credentials on the command line is insecure:
You can still set it via ENV variable or config file, or use "--ssh-key-file" flag.
[env var: MARGE_SSH_KEY] (default: None)
--ssh-key-file FILE Path to the private ssh key for marge so it can clone/push.
[env var: MARGE_SSH_KEY_FILE] (default: None)
--embargo INTERVAL[,..]
Time(s) during which no merging is to take place, e.g. "Friday 1pm - Monday 9am"
or "Fri 12:30 Europe/London - Mon 08:00 Europe/London"
[env var: MARGE_EMBARGO] (default: None)
--use-merge-strategy Use git merge instead of git rebase to update the *source* branch (EXPERIMENTAL)
If you need to use a strict no-rebase workflow (in most cases
you don't want this, even if you configured gitlab to use merge requests
to use merge commits on the *target* branch (the default).)
[env var: MARGE_USE_MERGE_STRATEGY] (default: False)
--add-tested Add "Tested: marge-bot <$MR_URL>" for the final commit on branch after it passed CI.
[env var: MARGE_ADD_TESTED] (default: False)
--batch Enable processing MRs in batches
[env var: MARGE_BATCH] (default: False)
--add-part-of Add "Part-of: <$MR_URL>" to each commit in MR.
[env var: MARGE_ADD_PART_OF] (default: False)
--add-reviewers Add "Reviewed-by: $approver" for each approver of MR to each commit in MR.
[env var: MARGE_ADD_REVIEWERS] (default: False)
--impersonate-approvers
Marge-bot pushes effectively don't change approval status.
[env var: MARGE_IMPERSONATE_APPROVERS] (default: False)
--merge-order The order you want marge to merge its requests.
As of earliest merge request creation time (created_at) or update time (updated_at)
[env var: MARGE_MERGE_ORDER] (default: created_at)
--approval-reset-timeout APPROVAL_RESET_TIMEOUT
How long to wait for approvals to reset after pushing.
Only useful with the "new commits remove all approvals" option in a project's settings.
This is to handle the potential race condition where approvals don't reset in GitLab
after a force push due to slow processing of the event.
[env var: MARGE_APPROVAL_RESET_TIMEOUT] (default: 0s)
--project-regexp PROJECT_REGEXP
Only process projects that match; e.g. 'some_group/.*' or '(?!exclude/me)'.
[env var: MARGE_PROJECT_REGEXP] (default: .*)
--ci-timeout CI_TIMEOUT
How long to wait for CI to pass.
[env var: MARGE_CI_TIMEOUT] (default: 15min)
--max-ci-time-in-minutes MAX_CI_TIME_IN_MINUTES
Deprecated; use --ci-timeout.
[env var: MARGE_MAX_CI_TIME_IN_MINUTES] (default: None)
--git-timeout GIT_TIMEOUT
How long a single git operation can take.
[env var: MARGE_GIT_TIMEOUT] (default: 120s)
--git-reference-repo GIT_REFERENCE_REPO
A reference repo to be used when git cloning.
[env var: MARGE_GIT_REFERENCE_REPO] (default: None)
--branch-regexp BRANCH_REGEXP
Only process MRs whose target branches match the given regular expression.
[env var: MARGE_BRANCH_REGEXP] (default: .*)
--debug Debug logging (includes all HTTP requests etc).
[env var: MARGE_DEBUG] (default: False)
Here is a config file example
add-part-of: true
add-reviewers: true
add-tested: true
# chose one way of specifying the Auth token
#auth-token: TOKEN
auth-token-file: token.FILE
branch-regexp: .*
ci-timeout: 15min
embargo: Friday 1pm - Monday 9am
batch: false
git-timeout: 120s
gitlab-url: "https://gitlab.example.com"
impersonate-approvers: true
project-regexp: .*
# chose one way of specifying the SSH key
#ssh-key: KEY
ssh-key-file: token.FILE
For more information about configuring marge-bot see --help
Running
First, create a user for Marge-bot on your GitLab. We'll use marge-bot
as
username here as well. GitLab sorts users by Name, so we recommend you pick one
that starts with a space, e.g. Marge Bot
, so it is quicker to assign to (our
code strips trailing whitespace in the name, so it won't show up elsewhere).
Then add marge-bot
to your projects as Developer
or Master
, the latter
being required if she will merge to protected branches.
For certain features, namely, --impersonate-approvers
, and --add-reviewers
,
you will need to grant marge-bot
admin privileges as well. In the latter, so
that she can query the email of the reviewers to include it in the commit. Note
that if you're trying to run marge-bot against a GitLab instance you don't have
yourself admin access to (e.g. https://www.gitlab.com), you won't be able to use
features that require admin for marge-bot.
Second, you need an authentication token for the marge-bot
user. You will need
to select the api
and read_user
scopes in all cases.
If marge-bot was made an admin to handle approver impersonation and/or adding a
reviewed-by field, then you will also need to add sudo
scope under
Impersonation Tokens
in the User Settings. Assuming your GitLab install is
install is https://your-gitlab.example.com
the link will be at
https://your-gitlab.example.com/admin/users/marge-bot/impersonation_tokens
).
On older GitLab installs, to be able to use impersonation features if marge-bot
was made an admin, use the PRIVATE TOKEN found in marge-bot's Profile Settings
; otherwise just use a personal token (you will need to impersonate the
marge-bot user via the admin UI to get the private token, it should then be at
http://my-gitlab.example.com/profile/personal_access_tokens
reachable via
Profile Settings -> Acess Tokens
).
Once you have the token, put it in a file, e.g. marge-bot.token
.
Finally, create a new ssh key-pair, e.g like so
ssh-keygen -t ed25519 -C marge-bot@invalid -f marge-bot-ssh-key -P ''
Add the public key (marge-bot-ssh-key.pub
) to the user's SSH Keys
in GitLab
and keep the private one handy.
Running marge-bot in docker (what we do)
Assuming you have already got docker installed, the quickest and most minimal way to run marge is like so (but see note about passing secrets on the commandline below):
docker run --restart=on-failure \ # restart if marge crashes because GitLab is flaky
-e MARGE_AUTH_TOKEN="$(cat marge-bot.token)" \
-e MARGE_SSH_KEY="$(cat marge-bot-ssh-key)" \
smarkets/marge-bot \
--gitlab-url='http://your.gitlab.instance.com'
Note that other users on the machine can see the secrets in ps
, because
although they are env vars inside docker, we used a commandline switch to set
them for docker run.
To avoid that you have several options. You can just use a yaml file and mount that into the container, for example this is how we actually run marge-bot at Smarkets ourselves:
# marge-bot-config.yml
add-part-of: true
add-reviewers: true
add-tested: true
impersonate-approvers: true
gitlab-url: "https://git.corp.smarkets.com"
project-regexp: "smarkets/smarkets$"
auth-token: "WoNtTelly0u"
ssh-key: |
-----BEGIN OPENSSH PRIVATE KEY-----
[...]
-----END OPENSSH PRIVATE KEY-----
docker run --restart=on-failure \
-v "$(pwd)":/configuration \
smarkets/marge-bot \
--config-file=/configuration/marge-bot-config.yaml
By default docker will use the latest
tag, which corresponds to the latest
stable version. You can also use the stable
tag to make this more explicit.
If you want a development version, you can use the master
tag to obtain an
image built from the HEAD commit of the master
branch. Note that this image
may contain bugs.
You can also specify a particular version as a tag, e.g.
smarkets/marge-bot:0.7.0
.
Running marge-bot in kubernetes
It's also possible to run marge in kubernetes, e.g. here's how you use a ktmpl template:
ktmpl ./deploy.yml \
--parameter APP_NAME "marge-bot" \
--parameter APP_IMAGE "smarkets/marge-bot" \
--parameter KUBE_NAMESPACE "marge-bot" \
--parameter MARGE_GITLAB_URL 'http://your.gitlab.instance.com' \
--parameter MARGE_AUTH_TOKEN "$(cat marge-bot.token)" \
--parameter MARGE_SSH_KEY "$(cat marge-bot-ssh-key)" \
--parameter REPLICA_COUNT 1 | kubectl -n=${KUBE_NAMESPACE} apply --force -f -
Once running, the bot will continuously monitor all projects that have its user as a member and will pick up any changes in membership at runtime.
Running marge-bot as a plain python app
Installing marge-bot with nix
Alternatively, if you prefer not to use docker, you can also directly run marge.
If you use nix do nix-env --install -f default.nix
.
The nix install should be fully reproducible on any version of linux (and also work on OS X, although this is not something we properly test). If you don't want to use docker we recommend you give nix a try.
Installing marge-bot the old-fashioned way
Finally, although this is our least preferred alternative, you can always do
python3 setup.py install
(note that you will need python3.6).
Afterwards, the minimal way to run marge is as follows.
marge.app --auth-token-file marge-bot.token \
--gitlab-url 'http://your.gitlab.instance.com' \
--ssh-key-file marge-bot-ssh-key
However, we suggest you use a systemd unit file or some other mechanism to automatically restart marge-bot in case of intermittent GitLab problems.
Suggested workflow
-
Alice creates a new merge request and assigns Bob and Charlie as reviewers
-
Both review the code and after all issues they raise are resolved by Alice, they approve the merge request and assign it to
marge-bot
for merging. -
Marge-bot rebases the latest target branch (typically master) into the merge-request branch and pushes it. Once the tests have passed and there is a sufficient number of approvals (if a minimal approvals limit has been set on the project), Marge-bot will merge (or rebase, depending on project settings) the merge request via the GitLab API. It can also add some headers to all commits in the merge request as described in the next section.
Adding Reviewed-by:, Tested: and Part-of: to commit messages
Marge-bot supports automated addition of the following
two standardized git commit trailers:
Reviewed-by
and Tested-by
. For the latter it uses Marge Bot <$MERGE_REQUEST_URL>
as a slight abuse of the convention (here Marge Bot
is
the name of the marge-bot
user in GitLab).
If you pass --add-reviewers
and the list of approvers is non-empty and you
have enough approvers to meet the required approver count, Marge will add the
following header to each commit message and each reviewer as it rebases the
target branch into your PR branch:
Reviewed-by: A. Reviewer <a.reviewer@example.com>
All existing Reviewed-by:
trailers on commits in the branch will be stripped. This
feature requires marge to run with admin privileges due to a peculiarity of the
GitLab API: only admin users can obtain email addresses of other users, even
ones explicitly declared as public (strangely this limitation is particular to
email, Skype handles etc. are visible to everyone).
If you pass --add-tested
the final commit message in a PR will be tagged with
Tested-by: marge-bot <$MERGE_REQUEST_URL>
trailer. This can be very useful for
two reasons:
- Seeing where stuff "came from" in a rebase-based workflow
- Knowing that a commit has been tested, which is e.g. important for bisection
so you can easily and automatically
git bisect --skip
untested commits.
Additionally, by using --add-part-of
, all commit messages will be tagged with
a Part-of: <$MERGE_REQUEST_URL>
trailer to the merge request on which they
were merged. This is useful, for example, to go from a commit shown in git blame
to the merge request on which it was introduced or to easily revert a all
commits introduced by a single Merge Request when using a fast-forward/rebase
based merge workflow.
Impersonating approvers
If you want a full audit trail, you will configure GitLab
require approvals
for PRs and also turn on
reset approvals on push.
Unfortunately, since Marge-bot's flow is based on pushing to the source branch, this
means it will reset the approval status if the latter option is enabled.
However, if you have given Marge-bot admin privileges and turned on
--impersonate-approvers
, she will re-approve the merge request assuming after its own
push, but by impersonating the existing approvers.
Merge embargoes
Marge-bot can be configured not to merge during certain periods. E.g., to prevent
her from merging during weekends, add --embargo 'Friday 6pm - Monday 9am'
.
This is useful for example if you automatically deploy from master and want to
prevent shipping late on a Friday, but still want to allow marking merge requests as
"to be merged on Monday": just assign them to marge-bot
as any other day.
More than one embargo period can be specified, separated by commas. Any merge request assigned to her during an embargo period, will be merged in only once all embargoes are over.
Batching Merge Requests
The flag --batch enables testing and merging merge requests in batches. This can significantly speed up the rate at which marge-bot processes jobs - not just because merge requests can be tested together, but because marge-bot will ensure the whole set of merge requests is mergeable first. This includes, for example, checking if a merge request is marked as WIP, or does not have enough approvals. Essentially, users get faster feedback if there is an issue. Note that you probably won't need this unless you have tens of merge requests a day (or extremely slow CI).
How it works
If marge-bot finds multiple merge requests to deal with, she attempts to create a batch job. She filters the merge requests such that they have all have a common target branch, and eliminates those that have not yet passed CI (a heuristic to help guarantee the batch will pass CI later).
Once the merge requests have been gathered, a batch branch is created using the commits from each merge request in sequence. Any merge request that cannot be merged to this branch (e.g. due to a rebase conflict) is filtered out. A new merge request is then created for this branch, and tested in CI.
If CI passes, the original merge requests will be merged one by one.
If the batch job fails for any reason, we fall back to merging the first merge request, before attempting a new batch job.
Limitations
-
Currently we still add the tested-by trailer for each merge request's final commit in the batch, but it would probably be more correct to add the trailer only to the last commit in the whole batch request (since that's the only one we know passed for sure in that combination). We might change this in the future or make it configurable, but note that there's still a much stronger chance all intermittent final commits also passed then when just testing on each source branch, because we know the final linearization of all commits passes in that all MRs passed individually on their branches.
-
As trailers are added to the original merge requests only, their branches would need to be pushed to in order to reflect this change. This would trigger CI in each of the branches again that would have to be passed before merging, which effectively defeats the point of batching. To workaround this, the current implementation merges to the target branch through git, instead of the GitLab API. GitLab will detect the merge request as having been merged, and update the merge request status accordingly, regardless of whether it has passed CI. This does still mean the triggered CI jobs will be running even though the merge requests are merged. marge-bot will attempt to cancel these pipelines, although this doesn't work too effectively if external CI is used.
-
There is what can be considered to be a flaw in this implementation that could potentially result in a non-green master; consider the following situation:
- A batch merge request is created, and passes CI.
- Several merge requests are then merged to master, but one could fail (perhaps due to someone pushing directly to master in between).
- At this point, marge-bot will abort the batch job, resulting in a subset of the batch merge requests having been merged.
We've guaranteed that individually, each of these merge requests pass CI, and together with some extra merge requests they also pass CI, but this does not guarantee that the subset will. However, this would only happen in a rather convoluted situation that can be considered to be very rare.
Restricting the list of projects marge-bot considers
By default marge-bot will work on all projects that she is a member of.
Sometimes it is useful to restrict a specific instance of marge-bot to a subset
of projects. You can specify a regexp that projects must match (anchored at the
start of the string) with --project-regexp
.
One use-case is if you want to use different configurations (e.g.
--add-reviewers
on one project, but not the others). A simple way of doing is
run two instances of marge-bot passing --add-reviewers --project-regexp project/with_reviewers
to the first instance and --project-regexp (?!project/with_reviewers)
to the second ones. The latter regexp is a negative
look-ahead and will match any string not starting with project/with_reviewers
.
Restricting the list of branches marge-bot considers
It is also possible to restrict the branches marge-bot watches for incoming
merge requests. By default, marge-bot will process MRs targeted for any branch.
You may specify a regexp that target branches must match with --branch-regexp
.
This could be useful, if for instance, you wanted to set a regular freeze
interval on your master branches for releases. You could have one instance of
marge-bot with --embargo "Friday 1pm - Monday 9am" --branch-regexp master
and
the other with --branch-regexp (?!master)
. This would allow development to
continue on other branches during the embargo on master.
It is possible to restrict the source branches with --source-branch-regexp
.
Some handy git aliases
Only git bisect run
on commits that have passed CI (requires running marge-bot with --add-tested
):
git config --global alias.bisect-run-tested \
'f() { git bisect run /bin/sh -c "if !(git log -1 --format %B | fgrep -q \"Tested-by: Marge Bot\"); then exit 125; else "$@"; fi"; }; f'
E.g. git bisect-run-tested ./test-for-some-bug.sh
.
Revert a whole MR, in a rebase based workflow (requires running marge-bot with --add-part-of
):
git config --global alias.mr-revs '!f() { git log --grep "^Part-of.*/""$1"">" --pretty="%H"; }; f'
git config --global alias.mr-url '!f() { git log -1 --grep "^Part-of.*/""$1"">" --pretty="%b" | grep "^Part-of.*/""$1"">" | sed "s/.*<\\(.*\\)>/\\1/"; }; f'
git config --global alias.revert-mr '!f() { REVS=$(git mr-revs "$1"); URL="$(git mr-url "$1")"; git revert --no-commit $REVS; git commit -m "Revert <$URL>$(echo;echo; echo "$REVS" | xargs -I% echo "This reverts commit %.")"; }; f'
E.g. git revert-mr 123
. This will create a single commit reverting all commits
that are part of MR 123 with a commit message that looks like this:
Revert <http://gitlab.example.com/mygropup/myproject/merge_requests/123>
This reverts commit 86a3d35d9bc12e735efbf72f3e2fb895c0158713.
This reverts commit e862330a6df463e36137664f316c18b5836a4df7.
This reverts commit 0af5b70a98858c9509c895da2a673ebdb31e20b1.
E.g. git revert-mr 123
.
Troubleshooting
Marge-bot continuously logs what she is doing, so this is a good place to look
in case of issues. In addition, by passing the --debug
flag, additional info
such as REST requests and responses will be logged. When opening an issue,
please include a relevant section of the log, ideally ran with --debug
enabled.
The most common source of issues is the presence of git-hooks that reject
Marge-bot as a committer. These may have been explicitly installed by someone in
your organization or they may come from the project configuration. E.g., if you
are using Settings -> Repository -> Commit author's email
, you may need to
whitelist marge-bot
's email.
Some versions of GitLab are not good at reporting merge failures due to hooks
(the REST API may even claim the merge operation succeeded), you can find
this in gitlab-rails/githost.log
, under GitLab's logs directory.