get-disk-safety failed with 'Namespace manager exception: Nsm_model.Err.Namespace_id_not_found'.
Closed this issue · 4 comments
JeffreyDevloo commented
Problem description
Monitoring revealed the following undesirable behavior of the get-disk-safety command:
CRIT - EXCEPTION HC000 - Could not fetch alba information for backend nvmebackend Message: Command 'get-disk-safety' failed with 'Namespace manager exception: Nsm_model.Err.Namespace_id_not_found'.
What could have happened:
- Another healthcheck was busy with a test that involves creating and removing namespaces
- get-disk-safety was called while one of those namespaces was being deleted
Proposed solution
The whole command should not fail when one namespace cannot be fetched. Perhaps it could return the output collected so far and report the failures in a separate errors section?
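A minimal sketch of that idea, written in Python purely for illustration and not taken from alba's code. Every name here (NamespaceNotFound, collect_disk_safety, fetch_safety, the namespace names) is hypothetical; the point is only that a namespace which disappears between listing and fetching is recorded as an error instead of aborting the whole run.

```python
class NamespaceNotFound(Exception):
    """Hypothetical stand-in for Nsm_model.Err.Namespace_id_not_found."""


def collect_disk_safety(namespaces, fetch_safety):
    """Fetch disk safety per namespace, tolerating namespaces that vanish mid-run.

    `fetch_safety` is an assumed callable that returns the safety info for one
    namespace or raises NamespaceNotFound if the namespace no longer exists.
    """
    results = {}
    errors = {}
    for name in namespaces:
        try:
            results[name] = fetch_safety(name)
        except NamespaceNotFound as exc:
            # The namespace was deleted after it was listed: record it and continue.
            errors[name] = str(exc)
    return {"results": results, "errors": errors}


def fake_fetch(name):
    """Simulated fetch: one namespace is deleted by a concurrent healthcheck."""
    if name == "healthcheck-tmp":
        raise NamespaceNotFound("Namespace_id_not_found")
    return {"safety": 2}


report = collect_disk_safety(["vpool-a", "healthcheck-tmp", "vpool-b"], fake_fetch)
print(report)
```

With this shape, the monitoring check could still evaluate the namespaces that were fetched and decide separately how to treat the entries under "errors".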
domsj commented
Which alba version?
I think this should be fixed in 1.3.7; see #633.
JeffreyDevloo commented
The alba version is 1.3.7
root@ovs05:~# alba version
1.3.7
git_revision: "tags/1.3.7-0-gfb75d47"
git_repo: "Not available"
compile_time: "09/03/2017 12:41:35 UTC"
machine: "51ce1efbe55d 4.4.0-36-generic x86_64 x86_64 x86_64 GNU/Linux"
model_name: "Intel Xeon CPU E31220 @ 3.10GHz"
compiler_version: "4.03.0"
dependencies:
jeroenmaelbrancke commented
On OVH we forgot to restart the proxies after updating to 1.3.7.
Closing the tickets; if I still see this issue, I will reopen them.
jeroenmaelbrancke commented
The problem is not solved. Last night we received an urgent alert with the same error.
Alba version 1.3.8