failure in hook should not prevent next hooks to execute
Opened this issue · 17 comments
Operating system
Ubuntu 20.04 LTS
Helmfile Version
v0.160.0
Helm Version
v3.13.2
Bug description
When a hook fails, all next hooks should execute. However, the failure should be propagated to the .Event.Error
, so that hooks can decide what to do when an error happened.
I believe that's exactly the point behind .Event.Error
.
Example helmfile.yaml
releases:
- name: raw
chart: https://github.com/dysnix/charts/releases/download/raw-v0.3.2/raw-v0.3.2.tgz
wait: true
hooks:
- name: first-hook
events:
- postsync
showlogs: true
command: "false"
- name: second-hook
events:
- postsync
showlogs: true
command: "echo"
args:
- "this should execute, but it does not"
Error message you've seen (if any)
❯ helmfile sync
Upgrading release=raw, chart=https://github.com/dysnix/charts/releases/download/raw-v0.3.2/raw-v0.3.2.tgz
Release "raw" does not exist. Installing it now.
NAME: raw
LAST DEPLOYED: Tue Dec 26 10:21:27 2023
NAMESPACE: default
STATUS: deployed
REVISION: 1
TEST SUITE: None
Listing releases matching ^raw$
raw default 1 2023-12-26 10:21:27.258526025 -0300 -03 deployed raw-v0.3.2
hook[postsync] logs |
UPDATED RELEASES:
NAME CHART VERSION DURATION
raw https://github.com/dysnix/charts/releases/download/raw-v0.3.2/raw-v0.3.2.tgz v0.3.2 2s
in ./helmfile.yaml: failed processing release raw: hook[first-hook]: command `false` failed: command "/usr/bin/false" exited with non-zero status:
PATH:
/usr/bin/false
ARGS:
0: false (5 bytes)
ERROR:
exit status 1
EXIT STATUS
1
Steps to reproduce
See helmfile.yaml above
Working Helmfile Version
N/A
Relevant discussion
No response
@felipecrs In Helmfile, each hook is a self-contained entity, and there is no communication between multiple hooks. If you need to handle errors, it should be done within a single hook. Are there any other real-world scenarios where errors need to be passed between hooks?
In my case, yes.
For all releases I have in my hemfile, I have a presync and a postsync hook that starts collecting some logs (pre) and finishes (post).
Something like:
releases:
- name: raw
chart: https://github.com/dysnix/charts/releases/download/raw-v0.3.2/raw-v0.3.2.tgz
wait: true
hooks:
- name: start-logs-capture
events:
- presync
command: ./start_logs_capture.sh
- name: check-service
events:
- postsync
showlogs: true
command: "echo"
args:
- "this commands checks if service is up, fails otherwise"
- name: stop-logs-capture
events:
- postsync
command: ./stop_logs_capture.sh
And multiply that pattern to a lot of releases. I even have template releases that automatically adds such start-logs-capture and stop-logs-capture hook to all releases in my helmfile.
However, as you can probably tell, if the "check-service" command fails, the "stop-logs-capture" hook would never execute. And that's not good. :(
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
I still think this is valid.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Still alive, I can share my workaround if someone wants to know. But it's a little complicated.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Not stale
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
@felipecrs mind sharing your workaround?
Oh. Yes. It's kinda cumbersome. I have a trap EXIT in all hooks and ensure to write /tmp/failure and exit 0.
Then, I make sure the last hook always checks for /tmp/failure and if present exit 1.
I use helmfile in a container so /tmp is isolated.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Still valid.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.