Do not allow two rollups to share a table
Closed this issue · 6 comments
Rollup data for a path is written to a table whose name is determined form the retention period. This means that for any rollup definition such as the following, no two entries can have exactly the same retention period:
rollups:
default:
retention:
- 6s:6h
- 1m:7d
- 1h:30d <<< This goes into the 30-day table
- 6h:30d <<< So does this, rendering the data useless
- 6h:365d
aggregation: average
Proposed solution: Flag a warning when loading the configuration, and drop the definition (as we do for other such errors).
Which definition do we drop?
I'd rather call it a config loading error and go fatal with that reason.
We don't do any fatal errors during rollup config parsing. So far, we check (and skip) the following:
G.Log.System.LogWarn("Invalid aggregation method for \"%s\": %s", expression, v.Aggregation)
G.Log.System.LogWarn("Malformed regular expression for \"%s\": %s", expression, err.Error())
G.Log.System.LogWarn("Malformed definition for \"%s\": %s", expression, s)
G.Log.System.LogWarn("Malformed window for \"%s\": %s %s", expression, s, couplet[0])
G.Log.System.LogWarn("Duration less than minimum 1 second for \"%s\": %v", expression, window)
G.Log.System.LogWarn("Malformed retention for \"%s\": %s %s", expression, s, couplet[1])
G.Log.System.LogWarn(
"Next duration is not a multiple for \"%s\": %v %% %v remainder is %v",
expression, v.Window, shortestDuration, remainder)
G.Log.System.LogWarn("Rollup expression rejected due to previous errors: \"%s\"", expression)
These errors go to the log, which unfortunately we probably aren't monitoring.
I suppose we could set a flag if anything gets rejected during the parse, and then at the end issue a fatal error; that way it's impossible to ignore a bad config and send data to "default" when it shouldn't go there.
Well, we should definitely die if default is bad or undefined.
I suppose we could have a "strict" option to die if non-default paths are malformed, and go on letting stats fall through to default.
That might be a user feedback thing once people besides us are using it.
"Strict" option is a good idea, probably defaulting to "on".
In non-strict, we could synthesize a default "default" so that Cassabon works no matter how badly the config is hosed.
Works for me.
10s:1h
1m:30d
...seems reasonable for a default.