puppetlabs/trapperkeeper

Allow Service Protocols to be Partially Implemented

aperiodic opened this issue · 6 comments

Clojure allows protocols to be partially implemented: you can reify or extend to a protocol without implementing all of the protocol's methods. This allows for backwards-compatible additions to the protocol, since a new method can be added to the protocol without having to update all implementations in lockstep.
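
For reference, here's what that looks like with plain Clojure and reify (the names are just illustrative):

(defprotocol Greeter
  (hello [this])
  (goodbye [this]))

;; goodbye is never implemented, and this still compiles
(def partial-greeter
  (reify Greeter
    (hello [_] "hi")))

(hello partial-greeter)    ;; => "hi"
(goodbye partial-greeter)  ;; throws java.lang.AbstractMethodError at call time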

TrapperKeeper currently requires services to implement every method of the service protocol that they specify. This makes it very difficult to add new methods to a protocol if the protocol has more than one implementation, since the definition and all implementations must be updated in lockstep, which is not always possible. Additionally, this behavior is counterintuitive given that Clojure's protocols are the basis for TK services and protocols allow for partial implementation.

I'd like TrapperKeeper to allow services to partially implement their protocols, so that backwards-compatible additions can be made to service protocols that have several implementations in different repositories, and so that the semantics match those of the Clojure protocols that TK services are based on.

@aperiodic might be better to have this on Jira. Also, what is your suggestion for how a consumer would know whether or not it was safe to call a protocol function on a service if there is no guarantee that they are implemented?

If memory serves, this was a conscious design decision. I see a lot of value in requiring services to implement all functions in a protocol, for the reason @cprice404 pointed out - it lets consumers know that the contract is honored. I understand the cognitive dissonance with the usual behavior of protocols but, OTOH, the fact that a protocol can normally be partially implemented is not without its own shortcomings.

I'm tentatively 👎 on this proposal, but interested in further discussion ... if you're adding a function to a protocol, wouldn't you really want to force all implementations that upgrade their dependency to the new version to also implement the new function? I mean, I guess your answer will be "no" 😈 but ... maybe there is an example/use-case you could provide which would illustrate why this is so bothersome?

@cprice404 I'm interpreting your question as implying that a compile-time check that a service provides some implementation for a protocol method is a sufficient guarantee that you can call that method and everything will be fine. I don't agree: I think that if you depend on a service then you should test against it to make sure its behavior matches your expectations, especially in a dynamically-typed language where protocols don't convey any information about the types of their arguments. You should at least have a smoke test verifying that when your application hands the service you're depending on what you think it wants, it gives you back something within your expectations. Without that, even if you have a guarantee that the service will do something when you call a protocol method, you can't have any confidence that your application will function as expected. If you do have that smoke test, then the compile-time check that all protocol methods are implemented doesn't give you anything[1]. Therefore, I think the value in enforcing that some implementation be provided for every method is extremely minor. So, to answer your question, I'd say "test your assumptions about the behavior of the services you're depending on". That gives you a much stronger guarantee than a static check that some form exists.
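
To be concrete, here's roughly the kind of smoke test I mean, sketched against trapperkeeper's test helpers with a made-up FrobnicatorService (everything under my.app is hypothetical):

(ns my.app.frobnicator-smoke-test
  (:require [clojure.test :refer [deftest is]]
            [puppetlabs.trapperkeeper.app :refer [get-service]]
            [puppetlabs.trapperkeeper.testutils.bootstrap
             :refer [with-app-with-config]]
            ;; hypothetical namespaces for the service we depend on
            [my.app.protocols :refer [frobnicate]]
            [my.app.services :refer [frobnicator-service]]))

;; Boot the implementation we actually depend on and check that, for the input
;; our application really sends, it returns something within our expectations.
(deftest frobnicator-smoke-test
  (with-app-with-config app
    [frobnicator-service]
    {}  ;; whatever config this service needs
    (let [svc (get-service app :FrobnicatorService)]
      (is (map? (frobnicate svc {:widget-id 42}))))))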

@KevinCorcoran I would like to "force" all implementations to immediately implement the new functionality, but that's not always feasible. One of the promises of TrapperKeeper was to be able to have multiple implementations of a service that could be swapped out without any code changes. To me, this implies that some of those implementations could come from completely different jars that are shipped independently.

One could imagine having a particularly popular TK service--say, a generic KV data storage service--for which people you don't even know have written implementations. Once there are implementations that ship separately, which you may not even have control over, TK's restriction that all protocol methods must be implemented means the protocol can only have two kinds of releases: bugfix releases and major releases. Any new protocol method breaks every implementation that ships separately, so there's no room for backwards-compatible minor releases.

In the example of a generic KV data storage service, let's say you start out with just get and put, and then decide that you want to add an update function that can take a function to atomically change a value. Maybe some implementations of your service use a storage engine without strong enough consistency guarantees to be able to implement this. Since TK requires services to fully implement the protocol, those implementations are now stuck at the original version of your protocol. If you add more functions to the protocol that they could support, they still won't be able to implement them, since they can't implement update. Or worse, they pretend to implement them, but throw some unpredictable error when actually called.

This puts users of your service in a tough spot. Let's say they were using one implementation of your service, the Fast and Loose implementation, in development, and another, the Slow and Strict implementation, in production. Fast and Loose can't support update, but Slow and Strict can. What the user would like is for their application to check at runtime whether the KV storage service implementation supports update and change its behavior accordingly. If TK allowed partial service implementations, then it would be possible for the application to probe the implementation at startup, by calling update and seeing whether an AbstractMethodError is thrown (or by calling a TK-provided function that does the same thing). But with TK's current behavior, this situation is intractable. Either Fast and Loose doesn't implement the new method, in which case once the application depends on the new KV storage service protocol Fast and Loose won't even compile; or Fast and Loose pretends to support the operation but throws some other error when it's called, in which case there's no reliable way to tell whether an implementation supports a particular method; or Fast and Loose hacks something together that mostly works but behaves differently from what the application developers expect, when they would rather their application fail outright if update can't be supported. I think this really limits the potential of TrapperKeeper.
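
Here's a sketch of the probe I have in mind, assuming partial implementations were allowed. The protocol is the hypothetical KV one; I've used get-value/update-value rather than get/update purely to avoid shadowing clojure.core:

(defprotocol KeyValueStorage
  (put [this k v])
  (get-value [this k])
  (update-value [this k f]
    "Added in a later protocol version: atomically apply f to the value at k."))

;; At application startup, probe the implementation we were handed. An
;; implementation that skipped update-value would throw AbstractMethodError.
(defn probe-update-support
  [kv-service]
  (try
    (update-value kv-service ::probe-key identity)  ;; throwaway key
    true
    (catch AbstractMethodError _
      false)))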

We're in a similar situation with the RBAC Consumer Service. The protocol for the RBAC Consumer Service is defined in the clj-rbac-client repository, along with an implementation that talks to a remote RBAC service over HTTPS. The main RBAC repository depends on clj-rbac-client to get that protocol definition and provides an implementation of it that uses a local RBAC service via the TK protocol. This allows applications to use RBAC without having to depend on RBAC itself. But, the way testing works right now, it's not possible for us to add a new protocol method to the Consumer Service, add its implementation to RBAC, and promote both of them into console services and then PE without breaking a test pipeline somewhere along the way. If TK let services be partially implemented, then we could promote the new version of the protocol into console services (so it's there for RBAC's test pipeline) without breaking console services; today, console services stops compiling as soon as the new protocol is promoted, because the RBAC implementation doesn't yet have all the methods.


[1] Note that there is a big shortcoming in this check today: it doesn't provide any guarantee that the service implements all the arities of a multi-arity protocol method. So, this compiles:

;; assumes (:require [puppetlabs.trapperkeeper.core :refer [defservice]])
(defprotocol Foo
  (bar
    [this]
    [this baz]))

(defservice foo-service
  Foo
  []  ;; no dependencies
  (bar [this] (println "what's baz?")))

but if anybody tried to call the bar protocol method with the baz argument, it would blow up with an AbstractMethodError because foo-service doesn't implement that arity.
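
For comparison, the same gap shows up with a plain reify, no TK involved:

;; reusing the Foo protocol above; only the single-argument arity is implemented
(def partial-foo
  (reify Foo
    (bar [_] "what's baz?")))

(bar partial-foo)        ;; => "what's baz?"
(bar partial-foo :quux)  ;; throws java.lang.AbstractMethodError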

@aperiodic I feel like you are contradicting yourself here when you say that "if you depend on a service you should test against it", and then a short while later you say "One could imagine having a particularly popular TK service ... for which people you don't even know have written implementations." The latter was definitely one of the design goals, and in that model, it's certainly not possible to test against all implementations of a service.

Given that, if we have no compile-time guarantee that implementations provide all of the service protocol functions, then it seems like every consumer of every service would need to make a test call to every function of the upstream service at startup to check for its existence (and behavior)? At that point it sounds like you're implementing your test suite inside of your production code, running at startup. And what if some of those functions can't be called without side effects?

Not to mention that expecting consumers to probe functions and catch AbstractMethodError as part of the API for interacting with code seems extremely undesirable.

My expectation has been that, for a service protocol popular enough to have multiple implementations, once managing its versioning became challenging you would separate the protocol itself out into its own maven artifact. Then the protocol could be versioned independently from the implementations. This is where we were leaning with the WebServerService protocol when we were planning on supporting both Jetty7 and Jetty9, and I still think we might end up doing that at some point in order to make it easier to provide a non-Jetty implementation of the service. I think this pattern would resolve the CI breakage issue you are describing in your project.
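
Roughly the kind of split I mean, in Leiningen terms (the group IDs and version numbers here are all made up):

;; kv-storage-protocol/project.clj -- protocol-only artifact, no implementations
(defproject example/kv-storage-protocol "1.1.0"
  :description "Service protocol for the hypothetical KV storage service"
  :dependencies [[org.clojure/clojure "1.7.0"]])

;; fast-and-loose/project.clj -- one implementation, released on its own schedule
(defproject example/fast-and-loose "0.4.2"
  :dependencies [[org.clojure/clojure "1.7.0"]
                 ;; still pinned to the older protocol version
                 [example/kv-storage-protocol "1.0.0"]
                 [puppetlabs/trapperkeeper "1.1.1"]])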

I agree with you that it would be nice to have better ways to correlate versions of protocols with the service implementations, but I think the way to do that would probably be to add some sort of versioning to TK itself, so that when you expressed a dependency on a service, you could also specify the version of the protocol. That starts to get pretty complex though and doesn't seem like something we are likely to tackle until there is a pretty strong forcing function.

I can also see a case to be made that there ought to be a way to mark some service functions as optional; e.g. in your update case above, where there is an alternate way that a consumer could achieve roughly the same behavior. But for many (most?) services that we write, they only have two or three functions and all of them are critical to the utility of the service.

One last thought here: there is nothing precluding you from baking some of the behavior that you are describing into your service itself. In your KV store example, you could have a service protocol function called supports-update? which could be implemented to return information to the consumer about whether or not it's safe to call update on that implementation. This seems like a vastly superior API as compared to expecting the consumer to try to call update (which probably has side effects) and catch an exception.
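
Something like this sketch, reusing the hypothetical KV names from earlier in the thread (all illustrative):

(defprotocol KeyValueStorage
  (put [this k v])
  (get-value [this k])
  (supports-update? [this]
    "True if this implementation can honor update-value.")
  (update-value [this k f]))

;; Consumer code: no probing, no catching AbstractMethodError.
(defn bump-counter!
  [kv-service k]
  (if (supports-update? kv-service)
    (update-value kv-service k (fnil inc 0))
    ;; fall back to a (non-atomic) read-modify-write
    (put kv-service k (inc (or (get-value kv-service k) 0)))))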


w/rt your mention of the multi-arity shortcoming in the current compile-time check: I wasn't aware of that and I'd consider that a bug. I'll open a Jira ticket for it, thanks for the heads up.

Created TK-386 to capture the bug in the validation for multi-arity fns.

In general, I agree with @cprice404's take on this. In particular, I think that catching AbstractMethodError would be a really bad pattern to establish.

I hope versioning the protocol separately from the implementation(s) will solve the problem you're currently facing. I think that we probably would have already adopted such a pattern elsewhere, if not for the overhead. It also seems like the correct solution to the kind of problem you've described, as opposed to making protocol functions optional so that service implementations can upgrade to newer protocol versions and expect things to Just Work.

The idea of making it possible to mark some functions in a service interface as optional is an interesting one, and in the abstract I think it would be a good addition. However, I think it would require us to move our service interfaces away from Clojure protocols to something else that lets us mark a function as optional.