Add parameter to ignore file path patterns to avoid saving matched files
Closed this issue · 19 comments
The callback will retrieve files that are loaded by Ansible throughout the execution of the playbook. This could contain files that users might not want to save, either because they are sensitive or are simply not relevant.
The implementation would be somewhat similar to that we already do with ignored_arguments
and ignored_facts
for ignoring specified playbook parameters and host facts.
I was looking at something similar. We are using a "site-configuration" that holds specific private variables. I would ideally be able to ignore that file during the upload.
I can take a crack at implementing the feature.
@LaurentDumont it would be nice to implement this indeed. Happy to review patches !
Patterning off of ignored_arguments and ignored_facts should work.. there's:
- built-in docs
- getting parameters from the docs (ansible config doc parsing magic)
- the _get_or_create_file method where the files get created.
I've started a quick fix for our use case - https://review.opendev.org/#/c/703276/
- It doesn't allow for pattern matching but our current need is limited to ignoring specific unique file names.
- A large majority of the Ansible files are probably going to be named
main.yml
. It could be a good idea to allow ignoring based on file path (or a keyword within the file path). - Zuul failed once but the two errors we're somewhat nebulous and not related to my changes. I've re-run the CI and all tests passed.
Let me know if it's an okay start! I've tested locally by modifying the callback directly and I can see the file content being sanitized.
@LaurentDumont thanks for the patch !
A large majority of the Ansible files are probably going to be named main.yml. It could be a good idea to allow ignoring based on file path (or a keyword within the file path).
Yes, I think what we actually want are patterns instead of absolute file names because in the end, absolute file names can also be specified as patterns.
For example, a pattern of main.yml
would match all the main.yml files but main.y[a]?ml
would also match main.yaml.
If the users aren't interested in saving group_vars, they could specify a group_vars
pattern which would prevent saving any file path where "group_vars" is present.
There are some approaches for matching strings against a list of regexes on stack overflow that could be worth taking a look into.
Zuul failed once but the two errors we're somewhat nebulous and not related to my changes. I've re-run the CI and all tests passed.
It happens ¯\(ツ)/¯
@LaurentDumont I was thinking we'd probably want to have some minimal testing to demonstrate that we really aren't saving files that are filtered.
I have been tinkering with an approach and was wondering what you thought about it ?
It looks like this in practice and the patch for the implementation is here.
I think users could come up with their own use cases for it but I'd use it for integration testing ara itself.
For example, we could have a test where ARA would normally include a "secrets.yml" file, assert that it's there but actually empty:
- name: Retrieve the current playbook
ara_playbook:
register: playbook_query
- name: Recover secret file from ARA
vars:
playbook_id: "{{ playbook_query.playbook.id | string }}"
query: "?playbook={{ playbook_id }}&path=secrets.yml"
set_fact:
file: "{{ lookup('ara', '/api/v1/files/' + query) }}"
# TODO: This wouldn't actually work because the file list view doesn't have the content detail
- name: Assert file
assert:
that:
- file.content == "Not saved by ARA as configured by 'ignored_files'"
I like that! That could be a good test to have for the feature. I'll work on integrating the regex matching for the full file path. It's going to be more flexible than just the file name.
Before I forget, what is the right way to actually let the callback output the debug? I'm seeing self.log.debug("v2_playbook_on_start")
in the callback but using -v
on the playbook doesn't seem to let it output.
Nevermind!
Looks like adding logging.basicConfig(level=logging.DEBUG)
to the ara_default.py works to let the debug stuff output.
Before I forget, what is the right way to actually let the callback output the debug? I'm seeingself.log.debug("v2_playbook_on_start")
in the callback but using-v
on the playbook doesn't seem to let it output.Nevermind!
Looks like adding
logging.basicConfig(level=logging.DEBUG)
to the ara_default.py works to let the debug stuff output.
export ARA_LOG_LEVEL=DEBUG
should do it.
edit: There's also export ARA_DEBUG=true
for the Django bits.
So, I've been playing with this for a little bit and I'm a bit stuck. It looks like the behavior with escaping string/special characters and dealing with full path (with multiple "/") gets quirky and messy quickly.
I was looking at the behavior of getting the variables for the ansible.cfg file and I just wanted to confirm that the following was expected.
This is when I'm printing the value of the variable itself.
DEBUG:ara.plugins.callback.default:Show ignored_arguments
['\'["extra_vars"', '"vault_password_files"]\'']
DEBUG:ara.plugins.callback.default:Show ignored_facts
['\'["ansible_env"', '"ansible_all_ipv4_addresses"]\'']
DEBUG:ara.plugins.callback.default:Show ignored_files
['\'["test_ignored_file_1"', '"test_ignored_file_2"]\'']
#Print each value of the list ignored_files
'["test_ignored_file_1"
"test_ignored_file_2"]'
It looks like the "final" value includes the brackets and the single quotes. It also escapes the outer single quote. I am missing something in the way the list is built from the .ini value?
It looks like this
ignored_facts = '["ansible_env", "ansible_all_ipv4_addresses"]'
ignored_arguments = '["extra_vars", "vault_password_files"]'
ignored_files = test_ignored_file_1, test_ignored_file_2, *.test.*
Also works and outputs the variables without all the extra stuff around.
Are you looking at the values before or after the get_option ? Ansible should parse an "ini-style" list into a proper python list we can work with. Are you printing them as strings ? Using json.dumps ? Something else ?
That said, specifying lists in ini-style format (or in a format appropriate for an environment variable export) is not really great from a UX perspective -- case in point: the issue you had raised about allowed_hosts.
I need to think a little bit about this.
I'm just dumping the values after the get_option. For the escape stuff, it looks like you need repr
to get printed as the raw value. Printing with a loop seems to work correctly without repr
...
self.log.debug("Show ignored_arguments")
print(self.ignored_arguments)
self.log.debug("Show ignored_facts")
print(repr(self.ignored_facts))
self.log.debug("Show ignored_files")
print(repr(self.ignored_files))
for x in self.ignored_files:
print(x)
for x in self.ignored_arguments:
print(x)
Output
DEBUG:ara.plugins.callback.default:Show ignored_arguments
['\'["extra_vars"', '"vault_password_files"]\'']
DEBUG:ara.plugins.callback.default:Show ignored_facts
['\'["ansible_env"', '"ansible_all_ipv4_addresses"]\'']
DEBUG:ara.plugins.callback.default:Show ignored_files
['["test_ignored_file_1"', '"test_ignored_file_2"]']
["test_ignored_file_1"
"test_ignored_file_2"]
'["extra_vars"
"vault_password_files"]'
Maybe limiting to the file name would be a good start. I'm looking at our playbooks and I don't see a case where the regex would win over just the string matching technique.
@LaurentDumont you're right, maybe we don't need regexes. I added a comment in the patch, let me know what you think ?
@LaurentDumont you're right, maybe we don't need regexes. I added a comment in the patch, let me know what you think ?
That's good for me! I've added a patch. It works from what I've been able to test. I'll see about the integration tests as well.
That said, I'm still unsure about the proper format in ansible.cfg
. The only way I could make it work is by using the following format. I've also noticed that that the ignored_facts
didn't seem to work on my side.
[ara]
api_client = http
api_server = http://127.0.0.1:8000
ignored_facts = '["ansible_env", "ansible_all_ipv4_addresses"]'
ignored_arguments = '["extra_vars", "vault_password_files"]'
ignored_files = test_ignored_file_1, test_ignored_file_2, create_domains.yml
I'm going to try a fresh install somewhere just to make it isn't something local on my side.
@LaurentDumont the list format stuff is weird and confusing. Sorry about that.
There is also a distinction between how dynaconf parses things for the API server and how Ansible parses things for the callback plugin.
I've bumped into this when implementing a default label feature and ended up doing it this way:
ara/ara/plugins/callback/ara_default.py
Lines 222 to 230 in a8b8ff5
We should probably go back and do something similar for the other Ansible callback settings.
Before I forget, what is the right way to actually let the callback output the debug? I'm seeingself.log.debug("v2_playbook_on_start")
in the callback but using-v
on the playbook doesn't seem to let it output.Nevermind!
Looks like adding
logging.basicConfig(level=logging.DEBUG)
to the ara_default.py works to let the debug stuff output.
btw I realized that this is an issue and I've created one for it: #112
@LaurentDumont the list format stuff is weird and confusing. Sorry about that.
There is also a distinction between how dynaconf parses things for the API server and how Ansible parses things for the callback plugin.I've bumped into this when implementing a default label feature and ended up doing it this way:
ara/ara/plugins/callback/ara_default.py
Lines 222 to 230 in a8b8ff5
We should probably go back and do something similar for the other Ansible callback settings.
It could be good. I've been testing with the following format and it seems to work for ENV variables and the ansible.cfg .ini method.
With the latest master, if I use the following syntax.
[ara]
api_client = http
api_server = http://ara.coldnorthadmin.com:8000
default_labels = '["dev","deploy"]'
ignored_facts = '["ansible_env", "ansible_all_ipv4_addresses"]'
ignored_arguments = '["extra_vars", "vault_password_files"]'
I end up with
And the ignored_facts and ignored_arguments do not match anything and the output is saved on the api-server.
That said, if I use the following syntax for all the arguments, it works as either the .ini or the ENV variables.
#ini
ignored_facts = ansible_env,ansi
ble_all_ipv4_addresses
ignored_arguments = extra_vars,vault_password_files
ignored_files = install_prometheus_server.yml
#ENV
export ARA_IGNORED_FACTS=ansible_env,ansible_all_ipv4_addresses
export ARA_IGNORED_ARGUMENTS=extra_vars,vault_password_files
export ARA_IGNORED_FILES=install_prometheus_server.yml
It seems a bit more clean than whatever is being parsed by Dynaconf.
@LaurentDumont yeah, in your case default_labels
was parsed as a string which was then split into a list delimited by ,
so you ended up square brackets and double quotes.
There must have been a good reason why we documented the format with square brackets and double quotes (format consistent with dynaconf?) but I can't think of anything right now. If it works without, we should definitely document it that way.
I sent a patch to update the docs: https://review.opendev.org/705115
Do you think your patch for the file filtering feature is good to go ? We can add docs and tests in different patches, it's fine.
@LaurentDumont yeah, in your case
default_labels
was parsed as a string which was then split into a list delimited by,
so you ended up square brackets and double quotes.There must have been a good reason why we documented the format with square brackets and double quotes (format consistent with dynaconf?) but I can't think of anything right now. If it works without, we should definitely document it that way.
I sent a patch to update the docs: https://review.opendev.org/705115
Do you think your patch for the file filtering feature is good to go ? We can add docs and tests in different patches, it's fine.
Sounds good to me. Ready to merge!