bbc/http-transport

RFC: Creating unique cache keys

Closed this issue · 2 comments

RFC: Creating unique cache keys

After having a meeting with my team, I have created this RFC to get some feedback/thoughts from the teams before making any decisions.

Issue:

We need to check if cached data is available in Redis on request. Currently, http-transport-cache creates cache keys based on the request method and url and then retrives/stores in Redis.

https://github.com/bbc/http-transport-cache/blob/9c3ed32965f19c275febb3be40a0756c9d943324/lib/maxAge.js#L36

https://github.com/bbc/http-transport-cache/blob/9c3ed32965f19c275febb3be40a0756c9d943324/lib/maxAge.js#L54

However, in some cases we make multiple requests to the same URL but vary on headers (pass in same header key with different values depending on the context). In this situation, the responses differ but cache key stay the same therefore values get overwritten multiple times.

Proposed solutions:

Want we want to propose a similar solution to Flashheart v2 but instead of blacklisting headers, whitelist them.

In other words, we pass in an array of headers that we need to vary on and construct a unique cache key.

Example (option A - not touching http-transport-cache):

We pass in an array of headers we vary on from the client.

const clientParams = {
  name: 'client',
  logger: loggerInstance,
  stats: statsInstance,
  externalCache: { cache: cacheInstance, timeout: cacheTimeout },
  circuitbreaker: {
    maxFailures,
    resetTimeout
  },
  timeout: 2000,
  userAgent: 'user agent',
  retries: 3,
  retryDelay: 500,
  httpClient: httpClientInstance,
  varyOn: ['accept-language', 'some-other-header']
};

We then add varyOn(header) method to core client (https://github.com/bbc/http-transport/blob/master/lib/client.js) and configure it (https://github.com/bbc/flashheart/blob/7c04eec960ba7769743ecb6eabc3b15eb4902268/src/httpTransport/configuration.ts#L139) by chaining varyOn(header).

// core client
varyOn(header) {
  this._ctx.req.varyOn(header);
  return this;
}

// configuration
const builder = httpTransport
    .createBuilder(transport)
    .retries(getRetries(params))
    .retryDelay(getRetryDelay(params))
    .varyOn(params.varyOn)
    .use(toError());

This will make the header(s) that we vary on available in the Request class (https://github.com/bbc/http-transport/blob/master/lib/request.js). Request can then check if varyOn header(s) exist(s) in the actual headers. This will allow us to construct unique request keys using the url, request method and varyOn array.

// add inside the constructor
this._varyOn;

// add new method
getVaryOn(header) {
  this._varyOn = header;
  return this;
}

// update, something like this
getRequestKey() {
  const headers = this.getHeaders();
  const varyOnHeaders = this.getVaryOn();
  let varyOnKeyValue = [];

  for (let varyOnHeader of varyOnHeaders) { 
    if (headers[varyOnHeader]) {
      varyOnKeyValue.push(`${varyOnHeader}=${headers[varyOnHeader]}`);
    }
  }

  return `${this.getMethod()}:${this.getUrl()}:${varyOnKeyValue.join(':')}`;
}

Resulting key:

<method>:<url>:accept-language=en,some-other-header=some-value

Example (option B - passing value to http-transport-cache):

We could also extract the key creation logic into the http-transport-cache.

The parameters that we pass in from the client are the same as the above. Instead of passing varyOn to the request we pass to external and internal caches.

// configuration
function configureExternalCache(builder, params) {
  // code removed

  const cacheOpts = {
      name: params.name,
      varyOn: params.varyOn
  };
  
  // code removed
  
  builder.use(http_transport_cache_1.maxAge(cache, cacheOpts));
  builder.use(http_transport_cache_1.staleIfError(cache, cacheOpts));

  // code removed
}

http-transport-cache (https://github.com/bbc/http-transport-cache/blob/master/lib/cache.js#L11) can then hold the logic for cache key creation (same as above compare headers with varyOn value, etc.)

Extending functionality

The above examples and suggestions are based on passing varyOn at the global level. What would be also nice is to allow client to override at the request level. We can achieve this by passing varyOn array as one of the request options.

I believe we would then need to add a check somewhere here to make the varyOn available on Request.

Considerations

Additionally, something to be aware of is the length of cache keys. There is a good article that has some benchmarks. In short, there isn't a huge difference but the longer the key length the slower the runs are.

Questions

  • Do we need getRequestKey() to exist in the Request and can we move this logic out into http-transport-cache?
  • Any other ideas of how we can pull this off?

Conclusion

This will allow us and other teams to pass header key-value pairs they need to vary on and create unique cache keys.

Comments and suggestions are welcome.

Thank you.

In my opinion the caching logic should reside entirely within the http-transport-cache, since the approach of the http-transport is to be modular, it make more sense to have the getRequestKey moved in the cache module.

Another observation is around the key creation, I think it would be beneficial to decouple the length of the key from the size of the input by using a hashing mechanism. Doing so we prevent the key length to grow and have a consistent mechanism. Said that, it would be worth doing a test of the hash generation process to assess if it affecta significantly the performance of the cache module or not.

Closing this issue as we've add this feature here bbc/http-transport-cache#27. Thanks.