SEMICeu/LinkedDataEventStreams

Which members should be deleted with version-based retention policies

woutslabbinck opened this issue · 1 comment

With time-based retention policies, a tree-path and a duration are given. Based on those two triples, it is possible to reason about which members should be deleted when a retention policy is present.
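
As a rough illustration of that reasoning, here is a minimal Python sketch. It assumes the tree-path has already been resolved to a timestamp per member; the function name `expired_members` and the `timestamp_of` accessor are hypothetical and not part of the LDES specification.

```python
from datetime import datetime, timedelta, timezone

def expired_members(members, timestamp_of, duration, now):
    """Return the members whose timestamp is older than `now - duration`.

    `timestamp_of` is a hypothetical accessor standing in for evaluating
    the tree-path on a member; `duration` is the policy's duration.
    """
    cutoff = now - duration
    return [m for m in members if timestamp_of(m) < cutoff]

# Illustrative data only: two members with made-up timestamps.
members = [
    {"id": "a", "created": datetime(2024, 1, 1, tzinfo=timezone.utc)},
    {"id": "b", "created": datetime(2024, 1, 9, tzinfo=timezone.utc)},
]
to_delete = expired_members(
    members,
    timestamp_of=lambda m: m["created"],
    duration=timedelta(days=7),
    now=datetime(2024, 1, 10, tzinfo=timezone.utc),
)
```

With a path and a duration, the set of members to delete is fully determined by the data, which is the property the version-based policy currently lacks.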

For version-based retention policies, however, this is not clear. The only information given in this policy is an amount x and a versionKey, which identifies the collection of which a minimum of x members should be preserved.

When that amount is reached, it is impossible to determine which members should be deleted. Since random deletion is undesirable, it might be interesting to delete the oldest members of the collection.

I suggest that, just like in time-based retention policies, deterministic deletion behavior can be achieved by adding a tree-path, which leads to a timestamp of a member. Based on that timestamp, the oldest member of the collection can be pruned due to the retention policy.
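
To make the suggestion concrete, here is a minimal Python sketch of the proposed behavior. It assumes that members sharing the same versionKey value form one group and that the added tree-path resolves to a comparable timestamp; the names `members_to_prune`, `version_key_of` and `timestamp_of` are hypothetical and not taken from the specification.

```python
from collections import defaultdict

def members_to_prune(members, version_key_of, timestamp_of, amount):
    """Keep the `amount` newest members per version key; return the rest.

    `version_key_of` and `timestamp_of` are hypothetical accessors that
    stand in for evaluating the versionKey and the proposed tree-path.
    """
    groups = defaultdict(list)
    for member in members:
        groups[version_key_of(member)].append(member)

    prune = []
    for versions in groups.values():
        # Newest first, so everything past index `amount` is the oldest surplus.
        newest_first = sorted(versions, key=timestamp_of, reverse=True)
        prune.extend(newest_first[amount:])
    return prune

# Illustrative data only: three versions of "A" and one version of "B".
members = [
    {"id": "A1", "key": "A", "ts": 1},
    {"id": "A2", "key": "A", "ts": 2},
    {"id": "A3", "key": "A", "ts": 3},
    {"id": "B1", "key": "B", "ts": 1},
]
pruned = members_to_prune(
    members,
    version_key_of=lambda m: m["key"],
    timestamp_of=lambda m: m["ts"],
    amount=2,
)
```

Because the ordering comes from a timestamp in the data itself, any party with access to the members and the policy would arrive at the same set to prune.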

A design was added in #21 to address this. However, we will not enforce this as a MUST: your back-end system can still be smart enough to know for itself what the “last” members are, without necessarily making this known to the outside world.

However, if we also want to decentralize the retention policy enforcer, we do need the metadata proposed in #21.