Watcher does not log errors
Closed this issue · 1 comment
candlerb commented
From /var/log/ganeti/watcher.log
...
05/25/2014 12:08:32 PM Checking host89.ws.nsrc.org (job: 2171)
05/25/2014 12:08:32 PM host89.ws.nsrc.org (job: 2171) done. Status: success
05/25/2014 12:08:32 PM Mailing brian@nsrc.org about host89.ws.nsrc.org
05/25/2014 12:08:32 PM Handling host89.ws.nsrc.org (job: 2171)
05/25/2014 12:08:47 PM Checking host89.ws.nsrc.org (job: 2171)
05/25/2014 12:08:47 PM host89.ws.nsrc.org (job: 2171) done. Status: success
05/25/2014 12:08:47 PM Mailing brian@nsrc.org about host89.ws.nsrc.org
05/25/2014 12:08:47 PM Handling host89.ws.nsrc.org (job: 2171)
05/25/2014 12:09:02 PM Checking host89.ws.nsrc.org (job: 2171)
05/25/2014 12:09:02 PM host89.ws.nsrc.org (job: 2171) done. Status: success
05/25/2014 12:09:02 PM Mailing brian@nsrc.org about host89.ws.nsrc.org
05/25/2014 12:09:02 PM Handling host89.ws.nsrc.org (job: 2171)
05/25/2014 12:09:17 PM Checking host89.ws.nsrc.org (job: 2171)
05/25/2014 12:09:17 PM host89.ws.nsrc.org (job: 2171) done. Status: success
05/25/2014 12:09:17 PM Mailing brian@nsrc.org about host89.ws.nsrc.org
05/25/2014 12:09:17 PM Handling host89.ws.nsrc.org (job: 2171)
05/25/2014 12:09:32 PM Checking host89.ws.nsrc.org (job: 2171)
05/25/2014 12:09:32 PM host89.ws.nsrc.org (job: 2171) done. Status: success
05/25/2014 12:09:32 PM Mailing brian@nsrc.org about host89.ws.nsrc.org
05/25/2014 12:09:32 PM Handling host89.ws.nsrc.org (job: 2171)
05/25/2014 12:09:47 PM Checking host89.ws.nsrc.org (job: 2171)
05/25/2014 12:09:47 PM host89.ws.nsrc.org (job: 2171) done. Status: success
05/25/2014 12:09:47 PM Mailing brian@nsrc.org about host89.ws.nsrc.org
05/25/2014 12:09:47 PM Handling host89.ws.nsrc.org (job: 2171)
05/25/2014 12:10:02 PM Checking host89.ws.nsrc.org (job: 2171)
05/25/2014 12:10:02 PM host89.ws.nsrc.org (job: 2171) done. Status: success
05/25/2014 12:10:02 PM Mailing brian@nsrc.org about host89.ws.nsrc.org
05/25/2014 12:10:02 PM Job 4 reserved 31 (> 30) times, burying
I am guessing an exception occurred at this point, because the log never reaches "Mailing managers about ...". If so, the exception was silently caught, and I can't see where in the code that happens. I think it should be caught and logged.
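To illustrate what "caught and logged" could look like, here is a minimal Python sketch. It is not the actual ganetimgr code: `notify`, its `send` callable, and the logger name `watcher` are all hypothetical names chosen for this example. The only point is that the `except` branch calls `logger.exception`, so the traceback ends up in the log instead of being lost.

```python
import logging

# Hypothetical logger name; the real watcher may configure logging differently.
logger = logging.getLogger("watcher")


def notify(send, recipients, subject):
    """Call send(recipients, subject) and log any failure with its traceback.

    `send` is a stand-in for whatever function actually delivers the mail.
    Returns True on success, False if sending raised an exception.
    """
    try:
        logger.info("Mailing %s about %s", recipients, subject)
        send(recipients, subject)
        return True
    except Exception:
        # logger.exception logs at ERROR level and appends the traceback,
        # so a missing MTA would show up as e.g. a connection error.
        logger.exception("Failed mailing %s about %s", recipients, subject)
        return False
```

With something like this in place, a host without an MTA would presumably have left an error line and a traceback in watcher.log right after the "Mailing ..." entry, rather than only the eventual "burying" message.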
Note: this particular error was caused by no MTA being installed on the host where ganetimgr was running. Installing an MTA fixed the issue, but there were no clues in the log.
For comparison, a later run that does reach "Mailing managers about ...":
05/25/2014 01:13:49 PM Handling lock key cluster:nuc:instance:host88.ws.nsrc.org:lock (job 2203)
05/25/2014 01:13:49 PM Handling host88.ws.nsrc.org (job: 2203)
05/25/2014 01:14:04 PM Checking host88.ws.nsrc.org (job: 2203)
05/25/2014 01:14:13 PM Job 2203 finished, removing lock cluster:nuc:instance:host88.ws.nsrc.org:lock
05/25/2014 01:14:19 PM Checking host88.ws.nsrc.org (job: 2203)
05/25/2014 01:14:19 PM host88.ws.nsrc.org (job: 2203) done. Status: success
05/25/2014 01:14:19 PM Mailing brian@nsrc.org about host88.ws.nsrc.org
05/25/2014 01:14:19 PM Mailing managers about host88.ws.nsrc.org