Playbook fails on CentOS 8 Stream - ERROR! couldn't resolve module/action 'virt_net'.
rbo opened this issue · 27 comments
ansible-playbook ./ansible/setup.yml
[DEPRECATION WARNING]: [defaults]callback_whitelist option, normalizing names to new standard, use callbacks_enabled instead. This feature will be removed
from ansible-core in version 2.15. Deprecation warnings can be disabled by setting deprecation_warnings=False in ansible.cfg.
[WARNING]: provided hosts list is empty, only localhost is available. Note that the implicit localhost does not match 'all'
[DEPRECATION WARNING]: "include" is deprecated, use include_tasks/import_tasks instead. This feature will be removed in version 2.16. Deprecation warnings
can be disabled by setting deprecation_warnings=False in ansible.cfg.
ERROR! couldn't resolve module/action 'virt_net'. This often indicates a misspelling, missing collection, or incorrect module path.
The error appears to be in '/root/hetzner-ocp4/ansible/roles/openshift-4-cluster/tasks/create-network.yml': line 98, column 3, but may
be elsewhere in the file depending on the exact syntax problem.
The offending line appears to be:
- name: Define network {{ cluster_name }}
^ here
We could be wrong, but this one looks like it might be an issue with
missing quotes. Always quote template expression brackets when they
start a value. For instance:
with_items:
- {{ foo }}
Should be written as:
with_items:
- "{{ foo }}"
Installed a bunch of Ansible Galaxy collections:
ansible-galaxy collection install community.libvirt
ansible-galaxy collection install community.crypto
ansible-galaxy collection install community.general
ansible-galaxy collection install community.aws
ansible-galaxy collection install google.cloud
ansible-galaxy collection install community.azure
ansible-galaxy collection install kubernetes.core
Now fails with:
TASK [openshift-4-cluster : Define network ocp4] *******************************************************************************************
fatal: [localhost]: FAILED! => {"changed": false, "msg": "The `libvirt` module is not importable. Check the requirements."}
Ansible cannot load the libvirt Python module by default:
[root@ocp4 hetzner-ocp4]# ansible localhost -m virt -a command=list_vms -e 'ansible_python_interpreter=/usr/bin/python3'
[DEPRECATION WARNING]: [defaults]callback_whitelist option, normalizing names to new standard, use callbacks_enabled instead. This
feature will be removed from ansible-core in version 2.15. Deprecation warnings can be disabled by setting deprecation_warnings=False
in ansible.cfg.
[WARNING]: Skipping callback plugin 'profile_tasks', unable to load
localhost | SUCCESS => {
"changed": false,
"list_vms": []
}
[root@ocp4 hetzner-ocp4]# ansible localhost -m virt -a command=list_vms
[DEPRECATION WARNING]: [defaults]callback_whitelist option, normalizing names to new standard, use callbacks_enabled instead. This
feature will be removed from ansible-core in version 2.15. Deprecation warnings can be disabled by setting deprecation_warnings=False
in ansible.cfg.
[WARNING]: Skipping callback plugin 'profile_tasks', unable to load
localhost | FAILED! => {
"changed": false,
"msg": "The `libvirt` module is not importable. Check the requirements."
}
[root@ocp4 hetzner-ocp4]#
Ansible uses /usr/bin/python3.8 and not the system default, and the libvirt bindings are importable only from Python 3.6:
[root@ocp4 hetzner-ocp4]# /usr/bin/python3.8 -c "import libvirt"
Traceback (most recent call last):
File "<string>", line 1, in <module>
ModuleNotFoundError: No module named 'libvirt'
[root@ocp4 hetzner-ocp4]# /usr/bin/python3.6 -c "import libvirt"
[root@ocp4 hetzner-ocp4]#
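The same check can be scripted for several interpreters at once. The following is only a minimal sketch: the helper name `can_import` is hypothetical, and the candidate paths are the ones that appear in this thread, so they may not exist on other hosts.

```python
import subprocess
import sys


def can_import(interpreter: str, module: str) -> bool:
    """Return True if `interpreter` can import `module` (hypothetical helper)."""
    try:
        result = subprocess.run(
            [interpreter, "-c", f"import {module}"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,
        )
    except OSError:
        # The interpreter path does not exist or is not executable.
        return False
    return result.returncode == 0


if __name__ == "__main__":
    # Paths taken from this thread, plus the interpreter running this script.
    candidates = [
        "/usr/bin/python3.8",
        "/usr/bin/python3.6",
        "/usr/libexec/platform-python",
        sys.executable,
    ]
    for path in candidates:
        status = "ok" if can_import(path, "libvirt") else "cannot import libvirt"
        print(f"{path}: {status}")
```

Whichever interpreter reports "ok" is a candidate value for ansible_python_interpreter.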
Hot fix
Install collections:
ansible-galaxy collection install community.libvirt
ansible-galaxy collection install community.crypto
ansible-galaxy collection install community.general
ansible-galaxy collection install community.aws
ansible-galaxy collection install google.cloud
ansible-galaxy collection install community.azure
ansible-galaxy collection install kubernetes.core
Configure ansible_python_interpreter in your cluster.yml
Add
# Hot fix for https://github.com/RedHat-EMEA-SSA-Team/hetzner-ocp4/issues/205
ansible_python_interpreter: /usr/libexec/platform-python
to your cluster.yml
Next problem:
TASK [openshift-4-cluster : Select cluster & user] **************************************
fatal: [localhost]: FAILED! => {"msg": "You need to install \"jmespath\" prior to running json_query filter"}
I have found a workaround using Jinja's selectattr filter instead of JMESPath's json_query filter, as follows:
- name: Select cluster & user
set_fact:
cluster: "{{ kubeconfig.clusters | selectattr('name','equalto','ocp4') | map(attribute='cluster') | first }}"
user: "{{ kubeconfig.users | selectattr('name','equalto','admin') | map(attribute='user') | first }}"
Next problem:
TASK [openshift-4-cluster : Create infra-registry pv] ***************************************
fatal: [localhost]: FAILED! => {"changed": false, "msg": "kubernetes >= 12.0.0 is required"}
Fixed by:
# pip3 install -I kubernetes openshift
# pip3 list | grep kubernetes
kubernetes (22.6.0)
and the cluster is up and running
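To see whether the interpreter Ansible uses already satisfies the "kubernetes >= 12.0.0" requirement, something like the sketch below may help. It assumes Python 3.8 or newer (for `importlib.metadata`), the helper names are made up for illustration, and the version parsing is deliberately simplistic (numeric dotted parts only).

```python
from importlib import metadata


def parse_version(text):
    """Turn '22.6.0' into (22, 6, 0); stops at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Compare two dotted version strings numerically."""
    return parse_version(installed) >= parse_version(minimum)


def installed_version(distribution):
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("kubernetes")
    if version is None:
        print("kubernetes is not installed for this interpreter")
    elif meets_minimum(version, "12.0.0"):
        print(f"kubernetes {version} satisfies >= 12.0.0")
    else:
        print(f"kubernetes {version} is too old, >= 12.0.0 is required")
```

Run it with the same interpreter that Ansible is configured to use, since each interpreter has its own site-packages.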
@snoussi thanks, looks good.
I am also investigating the use of an Ansible execution environment: https://github.com/RedHat-EMEA-SSA-Team/hetzner-ocp4/tree/ansible-ee The first test looks good:
$ podman run --rm -ti --security-opt label=disable -v /run/libvirt/:/run/libvirt/ \
-v /var/run/libvirt/:/var/run/libvirt/ \
quay.io/redhat-emea-ssa-team/hetzner-ocp4-ansible-ee:devel bash
bash-4.4# virsh list
Id Name State
--------------------------------
2 demo-master-0 running
3 demo-master-1 running
4 demo-master-2 running
5 demo-compute-0 running
6 demo-compute-1 running
bash-4.4# ansible localhost -m virt -a command=list_vms
[WARNING]: No inventory was parsed, only implicit localhost is available
localhost | SUCCESS => {
"changed": false,
"list_vms": [
"demo-master-0",
"demo-master-1",
"demo-compute-1",
"demo-compute-0",
"demo-master-2"
]
}
bash-4.4#
The long-term goal might be to use ansible-navigator too. Let's see...
@snoussi
Thanks a lot for the solution.
It works on my Hetzner server as well. Cool stuff.
Also failing on the firewalld module, missing from ansible-core:
TASK [openshift-4-cluster : Include OS specific part] ********************************************************************************************************
[DEPRECATION WARNING]: "include" is deprecated, use include_tasks/import_tasks/import_playbook instead. This feature will be removed in version 2.16.
Deprecation warnings can be disabled by setting deprecation_warnings=False in ansible.cfg.
fatal: [localhost]: FAILED! => {"reason": "couldn't resolve module/action 'firewalld'. This often indicates a misspelling, missing collection, or incorrect module path.\n\nThe error appears to be in '/home/manu/hetzner-ocp4/ansible/roles/openshift-4-cluster/tasks/prepare-host-CentOS-8.yml': line 38, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: Allow NFS traffic from VM's to Host\n ^ here\n"}
This was fixed with:
ansible-galaxy collection install ansible.posix
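Both "couldn't resolve module/action" errors in this thread came down to a missing collection. As a rough sketch, the mapping can be captured in a small lookup; the function name and the table are illustrative only and cover just the two modules mentioned here.

```python
import re

# Only the two modules that came up in this thread; extend as needed.
KNOWN_COLLECTIONS = {
    "virt_net": "community.libvirt",
    "firewalld": "ansible.posix",
}

# Matches the module name inside Ansible's "couldn't resolve" message.
PATTERN = re.compile(r"couldn't resolve module/action '([^']+)'")


def suggest_collection(error_text):
    """Return the collection to install for a resolve error, or None if unknown."""
    match = PATTERN.search(error_text)
    if match is None:
        return None
    return KNOWN_COLLECTIONS.get(match.group(1))


if __name__ == "__main__":
    message = "ERROR! couldn't resolve module/action 'firewalld'."
    collection = suggest_collection(message)
    if collection is not None:
        print(f"ansible-galaxy collection install {collection}")
```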
The selectattr workaround is not completely correct, as it hardcodes the cluster name.
The correct version should look like:
cluster: "{{ kubeconfig.clusters | selectattr('name','equalto',cluster_name) | map(attribute='cluster') | first }}"
user: "{{ kubeconfig.users | selectattr('name','equalto','admin') | map(attribute='user') | first }}"
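For readers less familiar with Jinja filters, here is roughly what the `selectattr(...) | map(attribute=...) | first` chain does, written as plain Python. This is a sketch, not the role's code: the helper name `select_by_name` and the sample kubeconfig data are invented for illustration, and unlike the Jinja expression this version raises an error when nothing matches.

```python
def select_by_name(entries, name, attribute):
    """Return entry[attribute] for the first entry whose 'name' equals `name`."""
    for entry in entries:
        if entry.get("name") == name:
            return entry[attribute]
    raise LookupError(f"no entry named {name!r}")


# Hypothetical, heavily trimmed kubeconfig structure.
kubeconfig = {
    "clusters": [
        {"name": "other", "cluster": {"server": "https://api.other.example.com:6443"}},
        {"name": "ocp4", "cluster": {"server": "https://api.ocp4.example.com:6443"}},
    ],
    "users": [
        {"name": "admin", "user": {"client-certificate-data": "REDACTED"}},
    ],
}

cluster_name = "ocp4"  # comes from cluster.yml in the real playbook
cluster = select_by_name(kubeconfig["clusters"], cluster_name, "cluster")
user = select_by_name(kubeconfig["users"], "admin", "user")
print(cluster["server"])
```

Passing `cluster_name` instead of a literal string is what makes the selection work for clusters not named ocp4.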
Please give me feedback on whether the Ansible execution environment works for you!
So I tried with the Ansible execution environment using the latest devel. Two remarks:
- We need to add a special flag if the playbook is to be run on the Hetzner server itself as the root user (which seems to be the recommended option according to the README?)
ansible-navigator run -m stdout ./ansible/setup.yml --connection=local
Otherwise we get the error:
TASK [Gathering Facts] *********************************************************
fatal: [host]: UNREACHABLE! => {"changed": false, "msg": "Failed to connect to the host via ssh: root@localhost: Permission denied (publickey).", "unreachable": true}
- Later on, firewalld fails with:
TASK [openshift-4-cluster : Enable & Start firewalld] **************************
fatal: [host]: FAILED! => {"changed": false, "msg": "Could not find the requested service firewalld: host"}
@EmmanuelKasper thanks for testing. Did you configure SSH properly, as described in the documentation? https://github.com/RedHat-EMEA-SSA-Team/hetzner-ocp4/tree/devel#initialize-tools
The --connection=local flag is interesting, but I changed the whole behaviour so that it works over an SSH connection. This means the ansible-ee connects to host, which is basically localhost: https://github.com/RedHat-EMEA-SSA-Team/hetzner-ocp4/blob/devel/inventory/hosts.yaml
With this change, it is technically possible to run the playbooks against a remote host.
Hope it helps, if not feel free to ping me directly.
I also have this issue on RHEL 8.6.
In the process of testing these workarounds on RHEL 8.6. So far so good.
Yes, I can confirm that by downgrading Ansible components and applying the fixes here, my RHEL 8.6 box is perfectly happy to deliver my test OCP clusters without any issues at all. So it is really not just a CentOS Stream issue.
@tomazb thanks. Yes, we also have a couple of problems on RHEL 8, which is why I introduced the Ansible execution environment to hetzner-ocp4. I expect to merge the Ansible execution environment into master in the next couple of days or weeks.
I was able to install a cluster on a RHEL 8.6 box by using the latest devel branch with some additional modifications to the virt module usage. Using the Ansible execution environment (EE) works perfectly for me now, but only as a standard user. Running the EE as root fails.
It looks like this is related to AAP 2.2: #220
On my freshly installed RHEL 8.6 with AAP 2.1 it works very well.
Fresh CentOS 8 Stream installation at Hetzner. I had to do the following steps:
ansible-galaxy collection install community.libvirt
ansible-galaxy collection install community.crypto
ansible-galaxy collection install community.general
ansible-galaxy collection install community.aws
ansible-galaxy collection install google.cloud
ansible-galaxy collection install community.azure
ansible-galaxy collection install kubernetes.core
# add to cluster.yml
ansible_python_interpreter: /usr/libexec/platform-python
pip3 install -I kubernetes openshift
pip3 install boto3
#hetzner-ocp4/ansible/roles/openshift-4-cluster/tasks/build-k8s-vars.yml
#Select cluster & user -task
cluster: "{{ kubeconfig.clusters | selectattr('name','equalto',cluster_name) | map(attribute='cluster') | first }}"
user: "{{ kubeconfig.users | selectattr('name','equalto','admin') | map(attribute='user') | first }}"
Next problem:
TASK [openshift-4-cluster : Select cluster & user] **************************************
fatal: [localhost]: FAILED! => {"msg": "You need to install \"jmespath\" prior to running json_query filter"}
To fix this, I installed the missing library with the following command:
/usr/bin/pip3.8 install -I jmespath
A lot of these issues come from mixing different Python versions and users when installing modules.
I solved this by forcing the Python interpreter used for execution and making sure all modules are installed for the right user (jmespath, libvirt-python, which is required by the libvirt Ansible module, etc.).
Migrating to an EE mitigates this for the 'local' modules, but libvirt-python must be available on the VM host itself, not in the EE.
@suulperi did you try the Ansible EE in the devel tree? #205 (comment)
@rbo No, I didn't. I was way too busy because of a demo session. I will try it as soon as possible.
Ansible execution env. & ansible-navigator changes merged into master with PR #212
The issue is solved with the new solution based on ansible-navigator. Please check out the new usage:
- Install ansible-navigator & configure SSH
- Run playbooks:
ansible-navigator run -m stdout ./ansible/setup.yml