openvstorage/openvstorage-health-check

Arakoon collapse check triggers too fast

jtorreke opened this issue · 3 comments

The manual arakoon collapse script does not collapse when there are fewer than 10 tlx files. Collapsing is triggered every 24 hours. As soon as the threshold of 10 tlx files is reached, HC triggers an alert saying there are too many tlogs and collapsing is not running properly, because the oldest tlx is more than 3 days old (the cluster I've seen this on generates only 1 tlx per day, so the oldest tlx was 10 days old).

Can you raise the ceiling for this to 15 tlx files? The current value of 10 is identical to the threshold configured in the manual arakoon collapse script.

It would be better to allow these values to be passed as arguments on the CLI. That way any customer can fine-tune them.

Fixed by #447
Packaged in: openvstorage-health-check_3.7.0-dev.1525250711.8ba3b6b-1_amd64.deb

Usage:

Usage: healthcheck_cli.py arakoon collapse-test [OPTIONS]

  Verifies collapsing has occurred for all Arakoons

Options:
  --max-collapse-age INTEGER  Maximum age in days for TLX
  --min-tlx-amount INTEGER    Minimum amount of TLX files before testing
  --help                      Show this message and exit.
root@jef-node01:~#
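
For readers wondering how the two options interact, here is a minimal Python sketch of the decision as described in this issue. It is not the actual health-check code: the function name `collapse_check`, its parameters and its return shape are hypothetical, and it assumes the check is skipped entirely while fewer than `min_tlx_amount` tlx files exist.

```python
import time

SECONDS_PER_DAY = 24 * 60 * 60


def collapse_check(tlx_mtimes, max_collapse_age, min_tlx_amount, now=None):
    """Hypothetical sketch of the collapse test; not the real implementation.

    tlx_mtimes:       modification times (epoch seconds) of the tlx files
    max_collapse_age: maximum allowed age of the oldest tlx, in days
    min_tlx_amount:   minimum number of tlx files before the age is tested
    Returns a tuple (ok, reason).
    """
    now = time.time() if now is None else now

    # Assumption: below the minimum amount, the collapse script has nothing
    # to do yet, so the age of the oldest tlx is not evaluated at all.
    if len(tlx_mtimes) < min_tlx_amount:
        return True, 'skipped: fewer tlx files than min_tlx_amount'

    oldest_age_days = (now - min(tlx_mtimes)) / SECONDS_PER_DAY
    if oldest_age_days > max_collapse_age:
        return False, 'oldest tlx is older than max_collapse_age'
    return True, 'collapsing looks healthy'


# The situation from the report: 1 tlx per day, 10 files, oldest 10 days old.
now = 1700000000
tlx_mtimes = [now - days * SECONDS_PER_DAY for days in range(1, 11)]

old_result = collapse_check(tlx_mtimes, max_collapse_age=3, min_tlx_amount=10, now=now)
new_result = collapse_check(tlx_mtimes, max_collapse_age=3, min_tlx_amount=15, now=now)
```

With a minimum of 10 files the sketch reports a failure for this cluster, while a minimum of 15 skips the age test. On a real node the equivalent tuning would be something like `healthcheck_cli.py arakoon collapse-test --min-tlx-amount 15`, using the options listed above.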

Released in https://github.com/openvstorage/openvstorage-health-check/releases/tag/3.7.0
Packaged in openvstorage-health-check_3.7.0-1_amd64.deb