napalm_install_config timeout with Arista HTTPS API
lvrfrc87 opened this issue · 9 comments
As per the title, here is the provider with the timers tuned to increase the timeout. It works fine with napalm_get_facts. Please note that I am running in a virtualenv.
eos_auth:
  persistent_command_timeout: 180
  persistent_connect_timeout: 180
  timeout: 180
  hostname: l1a.r5b1.ams7.nee.tmcs
  username: username
  dev_os: eos
  password: password
  optional_args:
    port: 443
    transport: https
Here the playbook:
- name: diff between running-conf/render and create backup of running-conf.
  napalm_install_config:
    config_file: './configurations/{{ inventory_hostname }}/renders/{{ config_file_name }}.cfg'
    commit_changes: False
    replace_config: False
    get_diffs: True
    archive_file: './configurations/{{ inventory_hostname }}/backups/{{ config_file_name }}.bak'
    diff_file: './configurations/{{ inventory_hostname }}/diff/{{ config_file_name }}.diff'
    provider: "{{ eos_auth }}"
And here the logs:
"archive_file": null,
"candidate_file": null,
"commit_changes": false,
"config": null,
"config_file": "./configurations/l1a.r5b1.ams7.nee.tmcs/renders/20200204_225450.cfg",
"dev_os": "eos",
"diff_file": "./configurations/l1a.r5b1.ams7.nee.tmcs/diff/20200204_225450.diff",
"get_diffs": true,
"hostname": "l1a.r5b1.ams7.nee.tmcs",
"optional_args": {
"port": 443,
"transport": "https"
},
"password": "VALUE_SPECIFIED_IN_NO_LOG_PARAMETER",
"persistent_command_timeout": 180,
"persistent_connect_timeout": 180,
"provider": {
"dev_os": "eos",
"hostname": "l1a.r5b1.ams7.nee.tmcs",
"optional_args": {
"port": 443,
"transport": "https"
},
"password": "VALUE_SPECIFIED_IN_NO_LOG_PARAMETER",
"persistent_command_timeout": 180,
"persistent_connect_timeout": 180,
"timeout": 180,
"username": "prd2204.svc"
},
"replace_config": false,
"timeout": 60,
"username": "prd2204.svc"
}
}
}
MSG:
cannot load config: Socket error during eAPI connection: The read operation timed out
Ansible 2.8.8
napalm 2.5.0
napalm-ansible 1.0.0
pyeapi 0.8.3
These won't be used at all and are not supported by napalm-ansible:
persistent_command_timeout: 180
persistent_connect_timeout: 180
You would have to look in the optional_args for EOS on NAPALM and see what NAPALM and PyEAPI support.
I took a quick look at the NAPALM code and it looks like we support passing in generic PyEAPI arguments:
https://github.com/napalm-automation/napalm/blob/develop/napalm/eos/eos.py#L155
So it really is a question of what PyEAPI supports in the context in which we are using it.
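One way to separate NAPALM/PyEAPI behaviour from napalm-ansible behaviour is to call the EOS driver directly from Python. The following is a minimal sketch, assuming napalm is installed and the device is reachable; `eos_driver_kwargs` and `diff_candidate` are hypothetical helper names, and the credentials are placeholders. It is not taken from the thread.

```python
def eos_driver_kwargs(hostname, username, password, timeout=180):
    """Build keyword arguments for the NAPALM EOS driver (hypothetical helper)."""
    return {
        "hostname": hostname,
        "username": username,
        "password": password,
        "timeout": timeout,  # seconds, passed by NAPALM to the eAPI connection
        "optional_args": {"transport": "https", "port": 443},
    }


def diff_candidate(config_path, **driver_kwargs):
    """Load a merge candidate, return the diff, and discard the candidate."""
    # Imported lazily so the helper above can be used without napalm installed.
    from napalm import get_network_driver

    device = get_network_driver("eos")(**driver_kwargs)
    device.open()
    try:
        device.load_merge_candidate(filename=config_path)
        diff = device.compare_config()
        device.discard_config()
        return diff
    finally:
        device.close()


kwargs = eos_driver_kwargs("l1a.r5b1.ams7.nee.tmcs", "username", "password")
```

Calling `diff_candidate("render.cfg", **kwargs)` (the file name is a placeholder) and timing it would show whether the 180-second value is honoured when Ansible is not involved.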
@ktbyers Thanks for your reply. I'll dig into the PyEAPI library. I noticed, though, that NAPALM passes timeout=self.timeout. Even though I set it to 180 seconds, the connection gets disconnected after 30 seconds sharp.
If I am reading the code right, the Arista library supports a timeout=60 argument. This might explain why the default timeout of 30 seconds is overridden to 60 (?)
That said, even when passing timeout=180 in optional_args, I still get a maximum of 60 seconds of connection:
optional_args:
  port: 80
  transport: http
  timeout: 180
I have also tried using http and setting the following in ansible.cfg, with no luck:
[persistent_connection]
connect_timeout = 180
command_timeout = 180
I have managed to override the timeout by adding timeout: 180 under the task.
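The log earlier in this issue shows a top-level "timeout" of 60 while the provider carries 180, which is consistent with a task-level default shadowing the provider value. A minimal sketch of that kind of merge follows. It assumes, without confirmation from the napalm-ansible source, that a provider value is used only when the task-level parameter is unset; `merge_params` is a hypothetical helper, and the module default of 60 is inferred from the log.

```python
def merge_params(task_params, provider):
    """Hypothetical merge: a provider value fills in only where the task value is unset."""
    merged = dict(task_params)
    for key, value in provider.items():
        if merged.get(key) is None:
            merged[key] = value
    return merged


provider = {"hostname": "l1a.r5b1.ams7.nee.tmcs", "timeout": 180}

# Task-level timeout left at an assumed module default of 60:
# the provider's 180 never reaches the driver.
default_case = merge_params({"hostname": None, "timeout": 60}, provider)

# timeout set explicitly under the task: 180 is what gets used.
explicit_case = merge_params({"hostname": None, "timeout": 180}, provider)

print(default_case["timeout"], explicit_case["timeout"])
```

Under that assumption, setting the value in the provider alone has no effect, whereas setting it on the task does, which matches what was observed here.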
After further troubleshooting, I can see that there is an HTTPS connection, but packets are not being sent or received:
User          Requests   Bytes in   Bytes out   Last hit
-----------   --------   --------   ---------   ---------------
prd2204.svc   456        530905     3856176     12 seconds ago

User          Requests   Bytes in   Bytes out   Last hit
-----------   --------   --------   ---------   ---------------
prd2204.svc   456        530905     3856176     171 seconds ago
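The two counter samples above can be compared mechanically to confirm that the session is idle rather than slow. This is a small sketch, assuming each row is whitespace-separated in the column order shown; `parse_row` is a hypothetical helper.

```python
def parse_row(line):
    """Split one counter row; the 'Last hit' column reads 'N seconds ago'."""
    user, requests, bytes_in, bytes_out, last_hit, *_ = line.split()
    return {
        "user": user,
        "requests": int(requests),
        "bytes_in": int(bytes_in),
        "bytes_out": int(bytes_out),
        "last_hit_s": int(last_hit),
    }


first = parse_row("prd2204.svc 456 530905 3856176 12 seconds ago")
second = parse_row("prd2204.svc 456 530905 3856176 171 seconds ago")

# No counter moved between the samples, while the idle time kept growing.
counters = ("requests", "bytes_in", "bytes_out")
stalled = all(first[name] == second[name] for name in counters)
idle_growth_s = second["last_hit_s"] - first["last_hit_s"]

print(stalled, idle_growth_s)
```

Identical request and byte counters with a growing "Last hit" value indicate that no eAPI traffic was exchanged between the two samples.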
With the support of Arista (see arista-eosplus/pyeapi#184), it seems that we found the issue:
"Napalm playbook worked with no issue until I enabled authorisation via TACACS+.
The way how the validation works in this case, every single command has to be validated at the TACACS+ server.
This means intensive communication between the switch and server and it takes a lot of time.
In my lab, in order to validate your configuration ansible spent 3 min 55 sec.
I guess your thoughts around napalm timeouts were correct, just needed a bit more tweaking.
To back up my theory, you can do a packet capture on the interface through which TACACS+ server is available and track the progress that way.
On the other hand, initial spin of your playbook executes in 23sec - I guess you might look into the options of either having local user just for napalm or disable authorisation for specific command type/users (if possible, although I didn't do a research on this one yet)."
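The figures quoted above can be put side by side with the 180-second timeout from the provider. This is only arithmetic on the reported numbers, assuming the 23-second run is the one without command authorization, which the quote suggests but does not state outright.

```python
with_authz_s = 3 * 60 + 55   # 3 min 55 sec reported with TACACS+ authorization
baseline_s = 23              # reported for the initial run (assumed: no authorization)
configured_timeout_s = 180   # value set in the provider in this issue

overhead_s = with_authz_s - baseline_s
slowdown = with_authz_s / baseline_s
fits_in_timeout = with_authz_s <= configured_timeout_s

print(overhead_s, round(slowdown, 1), fits_in_timeout)
```

On these numbers, even a timeout of 180 seconds that was fully honoured would be shorter than the 235 seconds the authorized run took in the Arista lab.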
However:
"Not sure what Napalm is doing in the background but doing config-replace with eos_config and other modules works just fine. This is what I am doing now and it works great. I will raise with Napalm guys and see what they say."
- name: render base.j2 template.
  local_action: template src="base.j2" dest="./configurations/{{ inventory_hostname }}/renders/{{ config_file_name }}.cfg"
- name: diff the running-config against a master config.
  eos_config:
    diff_against: intended
    intended_config: "{{ lookup('file', './configurations/{{ inventory_hostname }}/renders/{{ config_file_name }}.cfg') }}"
- name: backup running-config.
  eos_config:
    backup: yes
    backup_options:
      filename: "{{ config_file_name }}.bak"
      dir_path: "./configurations/{{ inventory_hostname }}/backups/"
# TO DO - Delete old files in Arista
- name: copy via scp rendered file into Arista.
  delegate_to: 127.0.0.1
  command: scp -i ~/.ssh/id_rsa.pub ./configurations/{{ inventory_hostname }}/renders/{{ config_file_name }}.cfg prd2204.svc@{{ inventory_hostname }}:/tmp/
- name: config-replace.
  eos_command:
    commands:
      - configure session {{ config_file_name }}
      - configure replace file:/tmp/{{ config_file_name }}.cfg
- name: save config.
  eos_config:
    save_when: always
@FedericoOlivieri I guess I am not following. Is there more to do here, or are you just saying that 'AAA command authorization' broke the automation and there are no follow-up actions?
Yes, you could do something like the above using eos_config (i.e. basically re-implement the NAPALM patterns, but using the ansible-core modules in some way).
@ktbyers When I use config replace with NAPALM on Arista with aaa authorization commands enabled, every single command sent by NAPALM is checked against AAA, so a config replace takes 5 minutes or more. The same config replace using eos_config with AAA enabled takes just a few seconds.
@FedericoOlivieri Okay, I think that is because they SCP the file and then load the copied file, as opposed to adding the commands into the configure session directly (which is what we do).
I think the implication of this is that AAA would be bypassed in the file load mechanism? That is potentially a bit of a security issue on Arista's part (i.e. it is fast because they are not actually evaluating the individual configuration commands and are thus bypassing your AAA). I guess it is a bypass only in a sense: you would still have to be authorized for the configure replace from a file.
Anyway, I don't think we would change that in NAPALM: using Secure Copy generally causes a set of other issues, and AAA authorization is generally not used (though it is definitely not a fringe case either).
Is your set of configuration changes very large?
FWIW, using AAA-authorization is a big pain for automation (i.e. it will probably cause you meaningful automation pain in the long-run).
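The difference between the two load mechanisms discussed above can be illustrated with a rough cost model. This is a sketch under stated assumptions, not a measurement: the command count, the authorization round-trip time, and the base time are all hypothetical values, and both function names are invented for illustration.

```python
def per_command_load_ms(num_commands, authz_round_trip_ms, base_ms):
    """Every command entered into the configure session is authorized separately."""
    return base_ms + num_commands * authz_round_trip_ms


def file_load_ms(authz_round_trip_ms, base_ms):
    """Only the single 'configure replace' command is authorized."""
    return base_ms + authz_round_trip_ms


# Hypothetical inputs: a 1000-line config, 200 ms per authorization, 20 s base time.
num_commands = 1000
authz_round_trip_ms = 200
base_ms = 20_000

session_ms = per_command_load_ms(num_commands, authz_round_trip_ms, base_ms)
replace_ms = file_load_ms(authz_round_trip_ms, base_ms)

print(session_ms / 1000, replace_ms / 1000)
```

In this model the per-command path grows linearly with the size of the configuration, which is why the size of the change set matters when command authorization is enabled.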
I do a full config replace. We worked around it by disabling the command-authorization side of AAA.