githubixx/ansible-role-wireguard

Execute lifecycle hooks upon reconfiguration to maintain consistency

Closed this issue · 4 comments

8ware commented

Managing iptables rules is a common example for using wg-quick's lifecycle hooks PreUp, PostUp, PreDown and PostDown. However, when changing those rules and deploying this role again they're not executed since just the plain wg configuration is sync'ed. What's worse, after updating the configuration WireGuard can't be even restarted with wg-quick since deleting a non-existing iptables rule results in an error which in turn aborts the shutdown halfway through.

Suppose the following PostUp and PreDown rules are deployed:

wireguard_postup:
  - iptables -A FORWARD -i %i -j ACCEPT
  - iptables -A FORWARD -o %i -j ACCEPT
  - iptables -t nat -A POSTROUTING -i %i -o eth0 -j MASQUERADE
# Generate inverse iptables commands
wireguard_predown: "{{ wireguard_postup | reverse | replace('-A', '-D') }}"

When I add a new rule (e. g. iptables -A INPUT -i % -j REJECT) to wireguard_postup and deploy the role again the configuration file contains two new lines: one PostUp and one PreDown entry (containing the new rule and its counterpart). When I try to restart WireGuard the counterpart of the new rule is already executed leading to an error.

On the otherhand, when I remove an iptables rule from wireguard_postup its counterpart is removed as well and won't be executed when restarting WireGuard leaving the system in an unclean state.

One way of resolving this issue is to shutdown the interface before updating the configuration file and starting it up again after the update. This comes with several challenges, though:

  • Active connections will be interrupted (which might not be that big of a problem in a private network environment)
  • To achieve idempotence we don't want to shutdown WireGuard if the configuration didn't change
  • In order to run the *Down rules of the old (currently active) configuration we cannot update the configuration before shutting down the interface

Since there might be even more side effects I don't anticipate I'd like to start a discussion on that topic. Thank you in advance.

8ware commented

Here are some approaches to the above mentioned challenges:

  • We shutdown WireGuard only if it's necessary, i. e. if a *Down rule has changed; in that case we probably want to restart WireGuard anyway
  • We use check_mode to determine whether the configuration has changed and diff to examine what in particular
  • We introduce a shutdown handler and flush it before actually updating the configuration

Here is a code snippet to illustrate my approach:

- ansible.builtin.template:
    src: etc/wireguard/wg.conf.j2
    dest: "{{ wireguard_remote_directory }}/{{ wireguard_interface }}.conf"
    owner: "{{ wireguard_conf_owner }}"
    group: "{{ wireguard_conf_group }}"
    mode: "{{ wireguard_conf_mode }}"
  check_mode: yes
  diff: yes
  register: wgc

- block:

    - ansible.utils.fact_diff:
        before: "{{ wgc.diff[0].before }}"
        after: "{{ wgc.diff[0].after }}"
      register: wgc_diff

    - ansible.builtin.debug:
        msg: Run Pre/PostDown hooks
      when: >
        wgc_diff.diff_lines|select("contains", "-PreDown")|length > 0
        or wgc_diff.diff_lines|select("contains", "-PostDown")|length > 0
      changed_when: yes
      notify: shutdown wireguard

  when: wgc is changed

- ansible.builtin.meta: flush_handlers

Though, this rises another issue: To examine the configuration's differences the no_log property cannot be used; otherwise the diff will be empty. Therefore, we either can't use that approach at all or need to handle log output carefully or we can respect the *Down rules only when using verbosity level above 2.

TBH I'm not sure if this is really something this role needs or should handle. This can become quite complicated and everybody wants it differently. Personally I would implement a Bash or Python script (or maybe two) and distribute it to the hosts with Ansible. Then this/these script(s) can be called with the hooks e.g.:

wireguard_postup:
  - /usr/local/bin/wg_postup.sh

wireguard_predown:
  - /usr/local/bin/wg_predown.sh
8ware commented

I understand that this functionality adds complexity but if this role is responsible for applying configuration changes it should do it consistently for all its aspects. Moving the logic to a separate script won't solve the problem.

The issue is about the lifecycle hooks not being executed upon reconfiguration which might lead to an unexpected state of the system. The issue is actually twofold: Without executing the Down hooks prior to any change the intended cleanup won't happen; likewise, without executing the Up hooks after applying the changes the system might not be configured as expected.

Both issues can be solved easily by shutting down WireGuard before and starting it again after applying the changes. By anticipating the configuration changes we can pick between the interrupting but consistent restart and the non-interrupting configuration sync.

Not perfect solution but wireguard_interface_restart variable solves part of it.