Add stat deletion/pruning capability
Two use cases to address here.
- Pruning stats older than a certain age and/or pruning stats that fall within a given time window. Possible examples:
- http://cassabon.foo.com/metrics/delete/?target=foo.path.bar&until=-14d
- http://cassabon.foo.com/metrics/delete/?target=foo.path.bar&from=-21d&to=-7d
- Deleting stats entirely from the redis index and the Cassandra database (since deleting the index would effectively orphan the data), which would go via the path route. Example:
Another implementation option would be to write a companion CLI utility instead of doing this through the web API, so that it could show what would be deleted and require confirmation before acting. It'd read the cassabon.yaml config to figure out what the rollup databases are and get the connection info for Cassandra and Redis. This may actually be the better option compared to unconfirmed deletes via the web API.
A third implementation option is actually having a web GUI to display the information a companion CLI would.
I'm leaning toward option two or three for implementation, because I'd want feedback to make sure I didn't screw something up, especially if I were to do a wildcard delete.
If we implement deletes via API, we could add a dry-run parameter to the request. Also, the response (dry-run and for-real) could provide statistics on what it would/did touch. Does this bring the API back into equal contention?
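To make the dry-run idea concrete, here is a minimal sketch, not the actual Cassabon code. The in-memory `tables` dict stands in for the rollup tables, and the function name, the `dryrun` parameter and the response field names are all assumptions made for illustration.

```python
def delete_metrics(tables, path, dryrun=True):
    """Count, and unless dryrun is set also remove, every row stored for `path`.

    `tables` is a hypothetical stand-in for the rollup tables:
    table name -> list of (path, timestamp) rows.
    """
    by_table = {}
    for name, rows in list(tables.items()):
        by_table[name] = sum(1 for row in rows if row[0] == path)
        if not dryrun:
            # Only the for-real request mutates the data.
            tables[name] = [row for row in rows if row[0] != path]
    return {
        "dryrun": dryrun,
        "total_deleted": sum(by_table.values()),
        "total_bytable": by_table,
    }


if __name__ == "__main__":
    tables = {
        "rollup_000001800": [("foo.bar.baz.count", 100), ("foo.bar.baz.sum", 100)],
        "rollup_000003600": [("foo.bar.baz.count", 200)],
    }
    print(delete_metrics(tables, "foo.bar.baz.count", dryrun=True))
    print(delete_metrics(tables, "foo.bar.baz.count", dryrun=False))
```

Both calls return the same shape, so a client can issue the dry run, inspect the counts, and then repeat the identical request with the flag flipped.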
Given the following data, what is eligible for deletion, and what is the cascade behavior?
127.0.0.1:6379[5]> zrangebylex cassabon_dev - +
1) "0001:foo:false"
2) "0002:foo.bar:false"
3) "0003:foo.bar.baz:false"
4) "0004:foo.bar.baz.average:true"
5) "0004:foo.bar.baz.count:true"
6) "0004:foo.bar.baz.max:true"
7) "0004:foo.bar.baz.min:true"
8) "0004:foo.bar.baz.sum:true"
Deleting "foo.bar.baz.count" obviously should succeed.
What should happen if you delete "foo.bar.baz"? Should it fail, or should it delete all child paths? As there would then be no leaves that start with "foo" or "foo.bar", should it delete those, too?
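One possible answer to the cascade question, sketched against the index entries above: delete the target and everything beneath it, then also drop any stem that is left without a leaf. This is only an illustration of that policy, not a decision; the function names are hypothetical, and the entry layout (`depth:path:isleaf`) is read off the redis output.

```python
ENTRIES = [
    "0001:foo:false",
    "0002:foo.bar:false",
    "0003:foo.bar.baz:false",
    "0004:foo.bar.baz.average:true",
    "0004:foo.bar.baz.count:true",
    "0004:foo.bar.baz.max:true",
    "0004:foo.bar.baz.min:true",
    "0004:foo.bar.baz.sum:true",
]


def parse(entry):
    """Split 'depth:path:isleaf' into (path, is_leaf)."""
    _depth, path, leaf = entry.split(":")
    return path, leaf == "true"


def cascade(entries, target):
    """Return the index entries removed by deleting `target` with full cascade."""
    parsed = [(entry,) + parse(entry) for entry in entries]
    # The target itself plus every path beneath it.
    doomed = {e for e, path, _ in parsed
              if path == target or path.startswith(target + ".")}
    surviving_leaves = [path for e, path, leaf in parsed
                        if leaf and e not in doomed]
    # A stem with no surviving leaf beneath it is now empty, so drop it too.
    for e, path, leaf in parsed:
        if not leaf and e not in doomed:
            if not any(l.startswith(path + ".") for l in surviving_leaves):
                doomed.add(e)
    return sorted(doomed)


if __name__ == "__main__":
    print(cascade(ENTRIES, "foo.bar.baz.count"))
    print(cascade(ENTRIES, "foo.bar.baz"))
```

With this policy, deleting "foo.bar.baz.count" removes a single entry, while deleting "foo.bar.baz" removes all eight, including "foo" and "foo.bar".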
Yeah, if there's a dry run flag, that brings the API back into contention.
If there's no wild card, only leaves (paths ending in :true) rather than stems should be able to be deleted.
For a wild-card or branch deletion, we should decide how greedy an ending wildcard should be, or if we should just have a parameter that signals deleting the entire branch and its contents.
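A rough sketch of the "leaves only, unless explicitly asked" rule follows. The `branch` flag is the hypothetical parameter that signals deleting an entire branch, and the index is modelled as a plain dict of path to is-leaf; neither is taken from the actual code.

```python
INDEX = {
    "foo": False,
    "foo.bar": False,
    "foo.bar.baz": False,
    "foo.bar.baz.average": True,
    "foo.bar.baz.count": True,
    "foo.bar.baz.max": True,
    "foo.bar.baz.min": True,
    "foo.bar.baz.sum": True,
}


def leaves_to_delete(index, target, branch=False):
    """Return the leaf paths a delete of `target` is allowed to touch."""
    if target not in index:
        raise KeyError("unknown path: " + target)
    if index[target]:
        # A leaf can always be deleted on its own.
        return [target]
    if not branch:
        # Stems are refused unless the caller opts in explicitly.
        raise ValueError(target + " is a stem; set branch=True to delete its contents")
    return sorted(p for p, is_leaf in index.items()
                  if is_leaf and p.startswith(target + "."))


if __name__ == "__main__":
    print(leaves_to_delete(INDEX, "foo.bar.baz.count"))
    print(leaves_to_delete(INDEX, "foo.bar", branch=True))
```

Requiring the explicit flag keeps an accidental stem delete from silently wiping a whole subtree.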
Do we need wildcards at all? For example, given the earlier example paths, is there (or should there be) any difference between the following?
DELETE foo.bar.baz.*
DELETE foo.bar.baz
While it could be possible to use leading or middle wildcards (*.bar, foo.*.baz), is there a reasonable use case for this?
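To compare the two forms on the sample paths, here is a sketch of segment-wise wildcard matching next to a plain branch-prefix test. It assumes a `*` matches exactly one dot-separated segment, which is only one possible semantics; the function names are made up for this example.

```python
from fnmatch import fnmatchcase

LEAVES = [
    "foo.bar.baz.average",
    "foo.bar.baz.count",
    "foo.bar.baz.max",
    "foo.bar.baz.min",
    "foo.bar.baz.sum",
]


def matches(pattern, path):
    """True if each segment of `path` matches the pattern segment at the same depth."""
    pattern_segs, path_segs = pattern.split("."), path.split(".")
    return len(pattern_segs) == len(path_segs) and all(
        fnmatchcase(seg, pat) for pat, seg in zip(pattern_segs, path_segs)
    )


def under_branch(branch, path):
    """True if `path` lies anywhere beneath `branch`."""
    return path.startswith(branch + ".")


if __name__ == "__main__":
    print([p for p in LEAVES if matches("foo.bar.baz.*", p)])
    print([p for p in LEAVES if under_branch("foo.bar.baz", p)])
```

On this data both selections are identical. They would diverge only if deeper paths such as a hypothetical "foo.bar.baz.qux.count" existed, which a single-segment trailing wildcard would skip and a branch delete would include.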
For metrics deletes, a leading wildcard probably requires a full table scan of every table that contains the matched paths.
Cassandra is lame, there are no range deletes. Suggestion: do a query, and then delete every row returned. Not gonna happen.
This means that we offer deleting EVERY ROW for a path, or not deleting anything. :(
I'm open to suggestions.
{
  "dryrun":false,
  "approximate_total_deleted":2,
  "approximate_total_bytable":{
    "rollup_000001800":0,
    "rollup_000003600":0,
    "rollup_000086400":0,
    "rollup_001814400":0,
    "rollup_002592000":2,
    "rollup_031536000":0
  },
  "delete_errors":{
    "rollup_000001800":"Invalid operator \u003e= for PRIMARY KEY part time",
    "rollup_000003600":"Invalid operator \u003e= for PRIMARY KEY part time",
    "rollup_000086400":"Invalid operator \u003e= for PRIMARY KEY part time",
    "rollup_001814400":"Invalid operator \u003e= for PRIMARY KEY part time",
    "rollup_002592000":"Invalid operator \u003e= for PRIMARY KEY part time",
    "rollup_031536000":"Invalid operator \u003e= for PRIMARY KEY part time"
  }
}
Metrics deletion is mostly in place, pending Cassandra ability to delete by range.
For Index deletions:
Enumerate all paths that match the wildcard. For each path:
- delete all metrics data
- remove entry from internal rollup data structure
- remove the path from ElasticSearch
It's important to clear the internal data structure, so that if that path arrives again in the input, it will be a cache miss and be re-indexed.
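The steps above, and the reason for clearing the internal structure, can be sketched with in-memory stand-ins. Everything here is hypothetical: the `Store` class, its attribute names, and the use of whole-path `fnmatchcase` matching (where `*` also crosses dots) are simplifications for illustration, not the real data structures.

```python
from fnmatch import fnmatchcase


class Store:
    """In-memory stand-ins for the three places a path lives."""

    def __init__(self):
        self.metrics = {}       # path -> data points (stand-in for the metrics tables)
        self.rollup_cache = {}  # path -> rollup state (stand-in for the internal structure)
        self.index = set()      # indexed paths (stand-in for the path index)

    def ingest(self, path, value):
        if path not in self.rollup_cache:
            # Cache miss: first time this path is seen, so index it.
            self.rollup_cache[path] = {"count": 0}
            self.index.add(path)
        self.rollup_cache[path]["count"] += 1
        self.metrics.setdefault(path, []).append(value)

    def delete(self, pattern):
        """Remove every indexed path matching `pattern` from all three places."""
        matched = sorted(p for p in self.index if fnmatchcase(p, pattern))
        for path in matched:
            self.metrics.pop(path, None)       # delete all metrics data
            self.rollup_cache.pop(path, None)  # forget the rollup state
            self.index.discard(path)           # remove the path from the index
        return matched


if __name__ == "__main__":
    store = Store()
    store.ingest("foo.bar.baz.count", 1)
    store.ingest("foo.bar.baz.sum", 2)
    print(store.delete("foo.bar.baz.*"))
    store.ingest("foo.bar.baz.count", 3)  # arrives again after the delete
    print(sorted(store.index))
```

If `delete` skipped the `rollup_cache.pop`, the second `ingest` of "foo.bar.baz.count" would be a cache hit and the path would never be put back in the index.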