henryr/cap-faq

Discussion of CAP consistency and Application-level consistency

pbailis opened this issue · 1 comments

As I mentioned in #3, I think there are two major causes of confusion when it comes to "beating CAP." Number two is the difference between CAP "consistency" and application-level "consistency."

While linearizable registers can be used to guarantee almost any application-level integrity constraint, it's not always required. For example, if we're trying to generate unique IDs for a client, if we opt for a sequential ID generation scheme, then we're going to have to be unavailable. However, if we generate IDs using a clientID and a logical clock, then we can guarantee uniqueness without coordination. This example assumes client IDs are unique, but there are other useful "consistency" properties we can achieve with HA, like ensuring that fields in a database are non-NULL.

What's going on here is that, at a high level, for restricted data types and operations, it is possible to satisfy integrity constraints without coordination and giving up HA. This difference between read/write linearizability and application-level consistency appears to be a source of confusion/contention (e.g., explicitly windowed queries over immutable data). This isn't limited to linearizability: the same debate crops up in conflict serializable databases (e.g., weak isolation models).

Unfortunately, I don't think it's well-understood at this point which application-level integrity constraints are achievable with HA; we've been thinking about this a bit, but I don't have anything concrete enough to add to the FAQ at this point. However, I think it's possible to tie this issue up neatly by stressing that CAP pertains only to read-write registers. That is, application-level integrity constraints are not addressed by the Gilbert and Lynch result. To be complete, the FAQ might mention that some constraints are achievable, but not all are. I think that's the limit of what we know today. This might go in "3. What is 'read-write storage'?" but could also go in the "common fallacies" section.

This is similar to the "What's the relationship between CAP and ACID?" question, but only for CAP "C" and ACID "C," not CAP "C" and ACID {"A", "I", and "D"}. I think it's worth a separate discussion.

I haven't forgotten about this issue, and I feel bad for having left it open so long. I'm going to attempt some answers in my RICON talk in October and will hopefully have a concise explanation after the talk.