Gateway should cache schemas
Closed this issue · 4 comments
The gateway queries downstream systems every time a schema is requested. I think it should have an option to cache schemas. Users could then request a schema with ?ttl=millis or some equivalent. If the schema isn't cached yet, or the cached copy is older than the TTL, the schema is re-fetched, cached, and returned. If the cached copy is within the TTL, it is returned as-is. If no TTL is set, "latest" is assumed, and the schema is always fetched.
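A rough sketch of the TTL lookup described above, in Python. This is only an illustration under stated assumptions, not the gateway's actual code: `SchemaCache`, `fetch`, and `ttl_millis` are hypothetical names, and storing the result of a no-TTL ("latest") fetch so later TTL requests can reuse it is an assumption the proposal doesn't spell out.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class SchemaCache:
    """Hypothetical in-memory schema cache keyed by schema name."""

    def __init__(self, fetch: Callable[[str], dict],
                 clock: Callable[[], float] = time.monotonic) -> None:
        # `fetch` stands in for whatever queries the downstream system.
        self._fetch = fetch
        # `clock` returns seconds; injectable so tests can control time.
        self._clock = clock
        # name -> (time the schema was fetched, schema)
        self._entries: Dict[str, Tuple[float, dict]] = {}

    def get(self, name: str, ttl_millis: Optional[float] = None) -> dict:
        now = self._clock()
        entry = self._entries.get(name)
        # Serve from cache only if the caller gave a TTL and the cached
        # copy is no older than that TTL.
        if ttl_millis is not None and entry is not None:
            fetched_at, schema = entry
            if (now - fetched_at) * 1000.0 <= ttl_millis:
                return schema
        # Not cached, too old, or no TTL ("latest"): always re-fetch.
        schema = self._fetch(name)
        # Assumption: fresh results are stored even for no-TTL requests.
        self._entries[name] = (now, schema)
        return schema
```

With this shape, `cache.get("users", ttl_millis=60000)` would accept a copy up to a minute old, while `cache.get("users")` always goes to the source. Note that nothing here invalidates an entry when the source schema changes; the caller's TTL is the only staleness bound.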
/cc @cpard
@criccomini I'm a little worried about the schema caching here. What if the requester ends up using invalid schema information because an update happened in the source and the cache wasn't invalidated on time?
My feeling is that caching is ok for stuff like statistics but when it comes to the schema, e.g. a column was dropped or a type was changed, there are cases where using the wrong data might cause real problems. WDYT?
SGTM. I thought you were requesting caching during our last call. If that's not the case, I'll drop this.
Sorry for that @criccomini, what I wanted to say during the call is that depending on the use case of the gateway, caching might be a viable solution or not. From my PoV, it's not that important at least considering the implications of managing invalidation consistently enough.
K, sounds good. I agree on the inconsistency part. Plus the (forthcoming) /registry path will act as a schema store, which could be used as a cache if needed. Plus^2, adding in-memory caching in the future is pretty easy to do as well.