debops/debops-playbooks

RFC: Drop role dependencies from roles

drybjed opened this issue · 22 comments

Some of you already know this but if not, @patrickheeney and other users call for better support for standalone role usage in DebOps, especially when users come from Ansible Galaxy. There is a way to do this, but it comes with a cost: role dependencies could be dropped from the roles themselves, and moved to the role plays (most of the roles have their own plays). The problem is, that for the DebOps playbook to work the same as right now, each play would need to have the complete copy of its dependency chain, preferably with all the variables from various roles used the same way.

Right now this dependency chain is resolved by Ansible itself and not really visible, moving it to the playbooks themselves would make it visible, but any changes would need to be propagated across all affected playbooks. After all roles are modified, from the user perspective playbook should work exactly the same as it works today (not 100% sure, but most likely). Custom playbooks and roles will also need to have full dependency chains present, otherwise roles will work partially and for example webserver won't have the needed firewall rules set if it hasn't been installed on a host previously.

It will take some time to rewrite all the roles and playbooks, and I would most likely do this after the playbooks are split into separate files (in the middle of the process right now). I'm inclined to do the conversion since individual roles will be more flexible and it will be easier to switch different services around, for example creating a set of playbooks that use apache2 instead of nginx as a webserver, but it will require manual maintenance of the dependency chain between various roles instead of relying on Ansible to do this for us.

Comments?

As you know, I full support this 👍.

I think this enables a lot more flexibility then just switching providers. You could also consider sunsetting the debops hooks as they could now be easily added before or after the task in everyones playbook. This enables more roles to be used directly from galaxy. It also has the ability to run everything faster since nginx for example runs once. It does not run once for every dependency. In my case I am installing 20 application roles so far, each one re-runs nginx and all dependencies. As you know, I ran into issues before with it detecting the default site since subsequent runs were contained in individual roles, which is now solved with this RFC.

As soon as you finish the playbook changes and tag it, can you update us with the gameplan? I would gladly pitch in on the conversion process.

My only concern is around playbooks loading all variables when using tags. For example if I run ansible-playbook -t nginx will it load secret vaiables when secret does not share the tag, but is used in the playbook? This could be tested easily enough.

To solve the problem run speed of the role several times.
You can use this code, but you must take into account the nuances.

- name: init test
  include: init.yml
  when: role_test_register_init is not defined

- name: init test register
  command: /bin/true
  register: role_test_register_init
  when: role_test_register_init is not defined
  changed_when: False

@le9i0nx OK, but that registered variable vanishes on the next run. I imagine you could save it as an Ansible local fact so that it's checked and reused on the next run, but... The main point of running full Ansible playbook from time to time is that it can ensure that settings you applied are still in place. When you use Ansible local facts to prevent Ansible from running the tasks, you lose that option as long as local facts are stored on remote host.

ypid commented

I am partly against this RFC. Most roles (at least the once I know) have a variable like owncloud_dependencies which allows you to disable the role dependencies and manage them manually. So the wanted behavior can already be achieved without a modification of all roles and playbooks.

The way I understand the DebOps project is to have a set of high-quality, well-documented roles and playbooks which are intended to work together. I know that it can be difficult if one does not already use the common roles and just wants to run one role to setup a service. But as said by @drybjed, there is a trade-off between users how just want one role right now and user which use the whole DebOps role stack. I think disabling role dependencies with a variable like owncloud_dependencies should be acceptable to users of the first category.

In my case I am installing 20 application roles so far, each one re-runs nginx and all dependencies.

Good point. I guess there is still room for improvement but I would propose a different solution. Why not check in the roles if it is currently used as dependencies (by checking for example ferm_input_dependent_list) and disable all the basic setup (maybe with one assert to ensure that the role has been fully run before). So the debops.nginx role could just setup the configuration and reload nginx in that case.

The thing about switching between alternatives like nginx or Apache could be solved by a "virtual role" (like virtual packages in Debian) or with alternatives in the role dependencies itself. The later one is already common practice in the DebOps project as it appears to me. In the long run "virtual roles" are probably easier to maintain.

That said. I think we need to differentiate two types of roles commonly used as role dependencies:

  1. Roles that ensure that a certain dependency is installed. Examples: Programming languages
  2. Roles which get variables from the "parent role" when used as role dependencies. Common examples: debops.nginx, debops.apt_preferences.

The first type could be eliminated somehow (somehow run it only once in a play, not sure how to do this), the later one should stay (possibly with improvements as mentioned).

Are there examples of how such a playbook would look like without role dependencies?

To be honest the whole idea of ditching the Ansible role dependencies and moving everything to playbook level is appealing to me more and more. In the new way, there are no "soft" and "hard" role dependencies - there are just "hard" ones, like debops.secret which is designed to be used as a dependency. I think that some other specific roles could be designated as "hard" dependencies, for example debops.apt_preferences is hard when your application needs a package from the Backports repository to work at all.

Other kinds of role dependencies are optional, and solutions used in them could be swapped at will (as long as we swap a role in all playbooks, that is, to avoid conflicts). Current role dependency system in Ansible is very rigid for this, and it requires for all involved roles to be run, even if they are skipped, as well as needs all of the roles to be installed to even allow playbooks to run. Ditching that system for pure playbook-based dependency chain might be better in the long run, and it allows for more granular system, which I like.

Here's a quick example of how this change would look like. Current system looks something like this:

# debops.secret - no dependencies
# debops.apt_preferences - no dependencies
# debops.ferm - no dependencies
# debops.php5 - no dependencies

# debops.mariadb/meta/main.yml
dependencies:
  - role: debops.secret

# debops.nginx/meta/main.yml
---
dependencies:
  - role: debops.secret
  - role: debops.apt_preferences
    apt_preferences_dependent_list: '{{ nginx_preferences1 }}'
  - role: debops.apt_preferences
    apt_preferences_dependent_list: '{{ nginx_preferences2 }}'
  - role: debops.ferm
    ferm_dependent_rules: '{{ nginx_ferm_rules }}'

# debops.librenms/meta/main.yml
---
dependencies:
  - role: debops.secret
  - role: debops.php5
    php5_config: '{{ librenms_php5_config }}'
  - role: debops.mariadb
    mariadb_config: '{{ librenms_php5_config }}'
  - role: debops.nginx
    nginx_config: '{{ librenms_nginx_config }}'

# playbook: librenms.yml
---
- hosts: debops_librenms
  roles:
    - role: debops.librenms

In the new system, it would all shift "downwards" towards the playbook:

# debops.secret - no dependencies
# debops.apt_preferences - no dependencies
# debops.ferm - no dependencies
# debops.php5 - no dependencies

# debops.mariadb/meta/main.yml
dependencies:
  - role: debops.secret

# debops.nginx/meta/main.yml
---
dependencies:
  - role: debops.secret
  - role: debops.apt_preferences
    apt_preferences_dependent_list: '{{ nginx_preferences1 }}'
  - role: debops.apt_preferences
    apt_preferences_dependent_list: '{{ nginx_preferences2 }}'

# debops.librenms/meta/main.yml
---
dependencies:
  - role: debops.secret

# playbook: librenms.yml
---
- hosts: debops_librenms
  roles:
    - role: debops.ferm
      ferm_dependent_rules: '{{ nginx_ferm_rules }}'
    - role: debops.php5
      php5_config: '{{ librenms_php5_config }}'
    - role: debops.mariadb
      mariadb_config: '{{ librenms_mariadb_config }}'
    - role: debops.nginx
      nginx_config: '{{ librenms_nginx_config }}'
    - role: debops.librenms

As you can see, when we take into account that dependent roles are supposed to be run before their "parent roles", from Ansible's point of view nothing really changes, order is preserved, all the handlers during a play should work the same, etc.

The one issue I see right now is that the defaults/main.yml file in various roles can become even longer and hard to maintain; let's hope that Ansible team can come up with a way to split it into smaller files similar to how inventory can be used currently.

ypid commented

The one issue I see right now is that the defaults/main.yml file in various roles can become even longer and hard to maintain

That would also be my question, because with this change librenms_nginx_config, librenms_mariadb_config and so on would have to be defined in the inventory and could not be defined in the role itself

@ypid The way I probably will do it, is that I will put librenms_nginx_config, librenms_mariadb_config, etc. in role default variables. These variables already exists, but are set either in meta/main.yml or vars/main.yml and are basically hardcoded.

After that you will be able to either use those variables in the playbook as configuration for a specific role, or provide your own configuration instead... I just need to check if variables defined in inventory will override variables defined in a playbook... That might be an issue.

IMHO, one of the major points for using debops is the ease of setup. I'm fine with moving dependencies into the playbooks, but users need to get much help on resolving the dependencies. E.g. every role should check of the dependencies are installed and configured. If not, print a meaningful error message.

The one issue I see right now is that the defaults/main.yml file in various roles can become even longer and hard to maintain

Can't this file be split up?

ypid commented

Sorry for the confusion. I was not sure if the following would work:

mkdir testing.first/tasks testing.second/defaults/ testing.second/tasks -p
touch testing.first/tasks/main.yml testing.second/tasks/main.yml
echo -e '---\ndef_in_second: 23' > testing.second/defaults/main.yml
echo -e '---\n- debug: var=def_in_second' > testing.first/tasks/main.yml
    - role: testing.first
    - role: testing.second

But it does:

TASK: [testing.first | debug var=def_in_second] ******************************* 
ok: [localhost] => {
    "var": {
        "def_in_second": "23"
    }
}

So I guess it can be done. Defining it all in defaults/ is not bad because then it can all be changed/overwritten from the inventory.

@htgoebel I'm not sure about checking everything in the roles, for example should a database role check for correct iptables rules? That adds additional tasks and complexity, I would hope that users could look at existing playbooks to see what roles are expected to be used, and provide similar set of roles in their own playbooks.

At the moment defaults/main.yml cannot be split into separate files. I've talked with some Ansible developers, there's an idea to allow multiple subdirectory levels and separate files in defaults/ similar to how the Ansible inventory works (code is already in place and tested), but that will most likely be considered after Ansible v2 is released.

ypid commented

In my case I am installing 20 application roles so far, each one re-runs nginx and all dependencies.

How would this be handled with the new design?

# playbook: librenms.yml
---
- hosts: debops_librenms
  roles:
    - role: debops.ferm
      ferm_dependent_rules: '{{ nginx_ferm_rules }}'
    - role: debops.php5
      php5_config: '{{ librenms_php5_config }}'
    - role: debops.mariadb
      mariadb_config: '{{ librenms_mariadb_config }}'
    - role: debops.nginx
      nginx_config: '{{ librenms_nginx_config }}'
    - role: debops.librenms

@ypid Psydocode as I don't recall all the variables. With example below you could add hundreds of sites easily while referencing variables from the site_one_app role. I would likely define the list of mariadb databases and nginx conf in inventory and not in the playbook, but for example purposes I have included it here. This would run each role just one time, which has been extremely performant in my testing.

# playbook: applications.yml
---
- hosts: debops_librenms
  roles:
    - role: debops.ferm
      ferm_dependent_rules: '{{ nginx_ferm_rules }}'
    - role: debops.php5
      php5_config: '{{ librenms_php5_config }}'
    - role: debops.mariadb
      mariadb_databases:
         - '{{ site_one_database }}'
    - role: debops.nginx
      nginx_conf:
         - '{{ site_one_nginx_conf }}'
    - role: debops.site_one_app

@ypid In my opinion this makes the inventory a lot more explicit. I can look in one place to see all of the databases and conf that will be installed on a host. Previously, I would have to read each roles dependency section and trace the variables to determine what will be installed. If I want to turn off that functionality then I have to set a variable to opt-out of all dependencies (even if I only wanted to opt out of just one database table). This enables you to easily add/edit/delete all of the configuration on a host.

I think that to make the use of multiple entries easier, each configuration entry should be defined as a dict variable in defaults/main.yml, so that it can then be used in a list in the playbook. An example ferm configuration from nginx role would looke like this:

# debops.nginx/defaults/main.yml

nginx_ferm_rules:
  type: 'dport_accept'
  dport: [ 'http', 'https' ]
  saddr: '{{ nginx_allow + nginx_group_allow + nginx_host_allow }}'
  accept_any: True
  filename: 'nginx_dependency_accept'
  weight: '20'          

Then it could be used among other configuration dictionaries like this:

- hosts: webserver
  roles:

    - role: debops.ferm
      ferm_dependent_rules:
        - '{{ nginx_ferm_rules }}'
        - '{{ mariadb_server_ferm_rules }}'

    - role: debops.nginx
      nginx_servers:
        - '{{ nginx_server_domain_one }}'
        - '{{ nginx_server_domain_two }}'

And so on, and so on. It gives more wiight on the playbooks side, but roles are more flexible and granular.

@drybjed Exactly. Your example was much better than mine. It also has all of the added benefits I mentioned in my last couple of posts.

Also, one potential idea to automate the collection of some of these variables if you really did not want to maintain them manually and be explicit would be to create a simple python module. It could act like this:

# playbook.yml
- hosts: webserver
  roles:
    - role: debops.ferm
      ferm_dependent_rules: '{{ collect('ferm_rules') }}'
    - role: debops.nginx
      nginx_servers: '{{ collect('nginx_servers') }}'

This could loop through each role directory and look for any variables with the prefix or suffix of ferm_rules or nginx_servers and add them to a list. By using this in your playbook instead of a role, the dependency is not forced upon debops itself but used as a handy shortcut if needed.

I would prefer things were explicit though, but this does potentially offer some magic for those who want it.

Something that occurred to me today. This RFC makes it easier to lock to a specific version and handle debops updates. Because the dependencies are not hard-coded in meta/main.yml you can specify them in your own requirements files and even work with multiple versions of a single role in different playbooks. By following semver, you can read the upgrade guide, update your dependency version, and re-run galaxy for example. Right now I don't update, because updating nginx for example, uses the latest versions of all dependencies, which I may have not updated, thus breaking things.

ganto commented

With regard to the non-debops users, I think the idea is quite interesting.

One thing that I'm a bit worried of is the point that @patrickheeney describes. Although, if you know what you are doing, it makes it easier to specify individual requirements for a role, it also introduces a new dependency of the individual roles to the playbooks repository.

As long as the playbooks are stored separate from the roles, it will introduce a problem in the following case. There is an update to role A introducing a dependency change. Same for another role B. If the user now only wants to update role B, it will get a new playbook for role A, which might fail in various ugly ways. Especially when thinking of the ongoing mysql to mariadb migration or the discussion about virtual roles, this is a bit of a concern for me.

I would welcome, if we could make it possible to add a playbook lookup mechanism and ship an example playbook with the role. E.g.

roles/debops.nginx/defaults/...
                   meta/...
                   playbooks/nginx.yml

In the DebOps playbooks there would be something like this:

# playbook.yml
- include: '{{ lookup("playbook", "nginx" }}'

This would then check all the (role) playbook directories for a matching nginx.yml playbook.

I was toying with the idea of adding simple playbook to each role directory which could then be somehow included in main playbook, but this would require writing another required plugin which would look up the roles in directories from a list, pick the first one found, and include the playbook... Which essentially duplicates Ansible role dependencies functionality. I suppose this lookup plugin could return empty if role was not found and this would allow for some kind of "optional" roles, but this again moves functionality out of core Ansible and forces DebOps to maintain the code. It also makes the whole setup harder to understand - where is that "nginx" role coming from?

And you still would need to add an include line in your own playbooks, so that doesn't change anything. Using the Ansible role functionality is cleaner, IMO.

ypid commented

I am currently thinking about solving this issue for debops.owncloud. I am not sure if I can drop the nginx and php dependencies from the role because some features of the role (autodeploy) depend on it and will probably break the test framework for the role when I remove them. Any ideas?

Test framework can use custom playbooks and install Ansible Galaxy roles if they are not included in the role dependencies, so if you have to remove some, go ahead. The tests can be updated as needed.

ypid commented

Status: This RFC is used for new roles and existing roles are migrated over time. @drybjed Want to add an approved label like it is used for debops/docs?