ansible-community/community-topics

Create guidelines for collection python requirements

samccann opened this issue · 8 comments

Summary

(copied from ansible/ansible-documentation#77. See that issue for some earlier discussion).

When packaging an Ansible collection or role, there is currently no standard way to specify Python requirements, primarily the hard requirements of plugins. Some people use requirements.txt, but I could not find any standard guidelines for role and collection authors. It would be nice to have this documented and discussed somewhere.

Here are some considerations for possible standards:

Using requirements.txt

Pros

  • Currently a popular approach
  • Does not leave a python footprint, e.g. a package name with .egg files

Cons

  • There is no standard way to distinguish build, test, and module requirements
  • Only python dependencies

Using python metadata

Pros

  • Current standard for python projects
  • Can specify separate build/test/module requirements, e.g.:
    • build-system.requires for any build requirements, e.g. in ansible-galaxy collection build and equivalents
    • project.dependencies for local runtime requirements, i.e. plugin requirements
    • project.optional-dependencies.test, etc. for ansible test. Naming not standardized, but it is not that much of an issue
    • project.optional-dependencies.local-runner, project.optional-dependencies.moduleA, or some other format, for the dependencies of a particular runner or module. This does not have to be standardized, just mentioned as good practice

Cons

  • Leaves a Python footprint, e.g. package names must avoid clashes. Not that severe, since some projects already have a setup.py configuration and use sensible namespacing
  • Only python dependencies

galaxy.yml

Pros

  • Already standardized

Cons

  • Only handles ansible dependencies

This could be expanded though to include:

  • system_requirements: for system, packages, library and executable requirements
  • python_requirements: for python package requirements

Maybe we could use a format similar to PEP 621 to separate them logically by usage.

In documentation

Pros

  • Widely used for system requirements

Cons

  • Not standardized
  • Hard to separate system and Python requirements, or to group them logically by usage

Additional Information

No response

Quick in-meeting note: we discussed adding a new standardized field for this to module/plugin documentation. One idea is to have a list of dictionaries for system packages and a list of dictionaries for Python packages, each with a name key and a when key that uses a subset of the conditionals that Ansible uses (such as distribution or os_family).
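
To make that shape a bit more concrete, here is a minimal sketch of what such a field might look like once parsed, plus a small check of each entry. The top-level keys `system` and `python`, the `validate` function, and treating `when` as optional are all assumptions for illustration; the meeting only got as far as "lists of dictionaries with a name key and a when key".

```python
# Hypothetical parsed form of the proposed documentation field.
REQUIREMENTS = {
    "system": [
        {"name": "python3-cryptography", "when": 'os_family == "RedHat"'},
        {"name": "py3-cryptography", "when": 'os_family == "Alpine"'},
    ],
    "python": [
        {"name": "cryptography"},  # no condition: assumed to apply everywhere
    ],
}

ALLOWED_KEYS = {"name", "when"}


def validate(requirements):
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []
    for kind in ("system", "python"):
        entries = requirements.get(kind, [])
        if not isinstance(entries, list):
            problems.append(f"{kind}: expected a list")
            continue
        for index, entry in enumerate(entries):
            where = f"{kind}[{index}]"
            if not isinstance(entry, dict):
                problems.append(f"{where}: expected a dictionary")
                continue
            if not isinstance(entry.get("name"), str):
                problems.append(f"{where}: missing or non-string 'name'")
            if "when" in entry and not isinstance(entry["when"], str):
                problems.append(f"{where}: 'when' must be a string")
            for key in sorted(set(entry) - ALLOWED_KEYS):
                problems.append(f"{where}: unknown key {key!r}")
    return problems


print(validate(REQUIREMENTS))
print(validate({"python": [{"when": "true", "version": 3}]}))
```

This only checks structure. Restricting `when` to a subset of conditionals would need a separate check on the expression itself.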

You can find a lot of details in the meeting logs at https://meetbot.fedoraproject.org/ansible-community/2023-04-19/ansible_community_meeting.2023-04-19-18.00.html. I'll try to sum this up when I find time (but I'm not sure how long it is going to take until I have enough time :) ) Anyway, this is a topic I'm quite interested in :)

any update on this issue?

Not yet. I definitely have more time now (as the semantic markup changes are all done), I'll try to write that promised summary soon :)

@russoz and I discussed similar things a while ago in ansible-collections/community.general#4512. I remember more discussions about this (before the meeting linked above), but these might have happened somewhere other than in that issue, or even just in my head ;-)

We have multiple things to solve:

  1. Where to store this information? For individual plugins and modules, it could be stored in DOCUMENTATION. (This also allows using docs fragments to share requirements for module groups.) For whole collections, some file in meta/ would make sense. Potentially #154 (comment) and #154 (comment) would help here.

  2. How to store requirements.

    • A single requirements.txt style list of Python requirements and a single bindep.txt style file for system requirements would be a first idea, but can be pretty limiting - especially when Python libraries can also be installed as system libraries, but using them only makes sense in some cases (like matching Python versions).
    • One proposal from the meeting would be a list of dictionaries, processed from top to bottom until one matches, where each dictionary allows specifying a name, a condition, a list of Python requirements, and a list of system requirements. The condition should only be allowed to use a subset of the conditions/facts that ansible-core provides (such as OS family, distribution, distribution version, Python version, ...).
    • I would allow having several of these lists, so that if you have independent dependencies you don't have to handle them both with the same set of conditionals. (Which could quickly lead to an explosion of combinations.)
  3. How much this will be supported by ansible-core and tooling.

    • There needs to be at least minimal support by ansible-core if this appears in DOCUMENTATION: allow the data to be present without resulting in a syntax error. If that's all that ansible-core's validation does, we need another validation tool.
    • antsibull-docs is already another tool that validates DOCUMENTATION (and it already does that more extensively than ansible-test's validate-modules, as it also validates documentation for test and filter plugins). It could also validate this information (and should validate it, at least to some extent, as it should also show it).
    • ansible-lint would be another candidate that could (and probably should) validate this.
  4. Who owns the specification format (in the sense that they are the only ones that can extend / modify it)? That also depends on who does the validation.

    • IMO this should be the community, represented by the Steering Committee.

I think the above summary includes everything discussed in the meeting.

Potential example:

  requirements:
    - name: cryptography library
      blocks:
        - name: RHEL, Fedora, etc
          when: ansible_facts.os_family == "RedHat"
          system:
            - python3-cryptography
        - name: Debian and Ubuntu
          when: ansible_facts.os_family == "Debian"
          system:
            - python3-cryptography
        - name: Alpine
          when: ansible_facts.os_family == "Alpine"
          system:
            - py3-cryptography
        - name: Arch Linux
          when: ansible_facts.os_family == "Archlinux"
          system:
            - python-cryptography
        - name: FreeBSD
          when: ansible_facts.os_family == "FreeBSD"
          system:
            # QUESTION: should we allow templating of the requirements themselves?
            # Or should we force splitting this up into a long list of
            # "Python version is 3.6: use py36-cryptography",
            # "Python version is 3.7: use py37-cryptography", etc?
            - py{{ ansible_facts.python.version.major }}{{ ansible_facts.python.version.minor }}-cryptography
        - name: Everything else
          python:
            - cryptography
    - name: bcrypt library
      blocks:
        - name: PyPI
          python:
            - bcrypt >= 3.1.5
    - name: system tools for LUKS modules
      blocks:
        - name: Alpine
          when: ansible_facts.os_family == "Alpine"
          system:
            - cryptsetup
            - device-mapper
            - lsblk
            - wipefs
        - name: Everything else except macOS and FreeBSD
          when: ansible_facts.os_family != "Darwin" and ansible_facts.os_family != "FreeBSD"
          system:
            - cryptsetup

This raises the question of whether we want to make requirements templatable. (See the above FreeBSD cryptography example.)
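
To compare the two options for the FreeBSD case, here is a small sketch. `render` fills `{{ dotted.path }}` placeholders from a facts dictionary, and `expand` produces the non-templated alternative of one explicit block per Python version. The regex-based substitution is only a stand-in for real Jinja2 templating, and the function names and the generated block layout are assumptions for illustration.

```python
import re

# Matches '{{ some.dotted.path }}'; a stand-in for Jinja2, not a replacement.
PLACEHOLDER = re.compile(r"\{\{\s*([A-Za-z_][\w.]*)\s*\}\}")


def lookup(path, variables):
    """Follow a dotted path such as 'a.b.c' through nested dictionaries."""
    value = variables
    for part in path.split("."):
        value = value[part]
    return value


def render(requirement, variables):
    """Replace every '{{ dotted.path }}' placeholder with its value."""
    return PLACEHOLDER.sub(
        lambda match: str(lookup(match.group(1), variables)), requirement
    )


def expand(template, candidates):
    """Non-templated alternative: one explicit block per Python version."""
    blocks = []
    for major, minor in candidates:
        facts = {
            "ansible_facts": {"python": {"version": {"major": major, "minor": minor}}}
        }
        blocks.append({
            "name": f"FreeBSD, Python {major}.{minor}",
            "when": (
                'ansible_facts.os_family == "FreeBSD"'
                f" and ansible_facts.python.version.major == {major}"
                f" and ansible_facts.python.version.minor == {minor}"
            ),
            "system": [render(template, facts)],
        })
    return blocks


TEMPLATE = (
    "py{{ ansible_facts.python.version.major }}"
    "{{ ansible_facts.python.version.minor }}-cryptography"
)
facts = {"ansible_facts": {"python": {"version": {"major": 3, "minor": 9}}}}
print(render(TEMPLATE, facts))
for block in expand(TEMPLATE, [(3, 8), (3, 9)]):
    print(block["when"], "->", block["system"])
```

The templated form stays a single entry but requires validators to understand templates; the expanded form is trivially checkable but grows with every supported Python version.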

Also there's the more general problem of using system packages vs. PyPI for cryptography, which depends on whether the system Python is used or not. So technically the example above is flawed in this regard: it uses the OS package regardless of whether the Python interpreter in use is the system Python.
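
For reference, the first-match rule from point 2 ("processed from top to bottom until one matches"), combined with several independent lists, could be sketched roughly as follows, using a trimmed copy of the example above. This is not how the prototype or ansible-core does it: `evaluate` uses Python's `ast` module as a stand-in for the restricted subset of conditionals, and both function names are assumptions.

```python
import ast


def evaluate(condition, variables):
    """Evaluate a tiny subset of conditionals: ==, !=, and, or, not,
    dotted lookups, and literals. Anything else raises ValueError."""

    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant):
            return node.value
        if isinstance(node, ast.Name):
            return variables[node.id]
        if isinstance(node, ast.Attribute):
            return walk(node.value)[node.attr]
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.Not):
            return not walk(node.operand)
        if isinstance(node, ast.BoolOp):
            values = (walk(value) for value in node.values)
            return all(values) if isinstance(node.op, ast.And) else any(values)
        if isinstance(node, ast.Compare) and len(node.ops) == 1:
            left, right = walk(node.left), walk(node.comparators[0])
            if isinstance(node.ops[0], ast.Eq):
                return left == right
            if isinstance(node.ops[0], ast.NotEq):
                return left != right
        raise ValueError(f"unsupported expression: {condition!r}")

    return bool(walk(ast.parse(condition, mode="eval")))


def resolve(requirements, variables):
    """For each independent requirement, pick the first block whose 'when'
    matches (a block without 'when' always matches)."""
    system, python = [], []
    for requirement in requirements:
        for block in requirement["blocks"]:
            if "when" not in block or evaluate(block["when"], variables):
                system.extend(block.get("system", []))
                python.extend(block.get("python", []))
                break  # first match wins; later blocks are ignored
    return {"system": system, "python": python}


# Trimmed copy of the example above, as it might look after YAML parsing.
REQUIREMENTS = [
    {
        "name": "cryptography library",
        "blocks": [
            {"name": "RHEL, Fedora, etc",
             "when": 'ansible_facts.os_family == "RedHat"',
             "system": ["python3-cryptography"]},
            {"name": "Alpine",
             "when": 'ansible_facts.os_family == "Alpine"',
             "system": ["py3-cryptography"]},
            {"name": "Everything else", "python": ["cryptography"]},
        ],
    },
    {
        "name": "bcrypt library",
        "blocks": [{"name": "PyPI", "python": ["bcrypt >= 3.1.5"]}],
    },
]

for family in ("RedHat", "Alpine", "Darwin"):
    facts = {"ansible_facts": {"os_family": family}}
    print(family, resolve(REQUIREMENTS, facts))
```

Because each requirement is resolved on its own, the cryptography and bcrypt lists never need a shared set of conditionals. The sketch does nothing about the system Python vs. interpreter mismatch mentioned above; that would need an extra fact in the conditions.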

I finally got around to implementing a prototype for this in ansible-collections/community.general#7720. This is not meant to be merged (for starters, I don't think community.general is the best destination for such a plugin), but it showcases the capabilities a bit by adding requirements to some of the modules in the collection.

I used the following playbook to test it:

- hosts: servers
  gather_facts: false
  tasks:
    - name: Install for remote
      community.general.plugin_requirements_info:
        plugins:
          - name: ufw
          - name: community.general.java_cert
          - name: community.general.plugin_requirements_info
      register: requirements
    - debug:
        var: requirements

    - name: Fetch remote facts
      setup:
        gather_subset: min

    - name: Install for remote with gathered facts
      community.general.plugin_requirements_info:
        plugins:
          - name: ufw
          - name: community.general.java_cert
          - name: community.general.plugin_requirements_info
      register: requirements
    - debug:
        var: requirements

    - name: Install with delegate_to=localhost
      community.general.plugin_requirements_info:
        plugins:
          - name: ufw
          - name: community.general.java_cert
          - name: community.general.plugin_requirements_info
      delegate_to: localhost
      register: requirements
    - debug:
        var: requirements

    - name: Install on controller
      community.general.plugin_requirements_info:
        plugins:
          - name: ufw
          - name: community.general.java_cert
          - name: community.general.plugin_requirements_info
        modules_on_remote: false
      register: requirements
    - debug:
        var: requirements

    - name: Fetch local facts
      setup:
        gather_subset: min
      delegate_to: localhost
      delegate_facts: true
      run_once: true

    - name: Install on controller with cached facts
      community.general.plugin_requirements_info:
        plugins:
          - name: ufw
          - name: community.general.java_cert
          - name: community.general.plugin_requirements_info
        modules_on_remote: false
      register: requirements
    - debug:
        var: requirements

(I'm not sure how hacky the part with 'get facts for localhost when connection is not local' is. The "Install with delegate_to=localhost" case likely uses the wrong facts. But I'm out of time for today :) )

We're now using the forum for community discussions, so I'm closing this in favor of this forum topic: https://forum.ansible.com/t/create-guidelines-for-collection-python-requirements/5051

Please continue the discussion there.