jpodwys/superagent-cache

Feature request: limit the total number of cached items


Using whatever is the simplest eviction algorithm to implement (FIFO?).

This sounds simple enough to implement and seems like it should only add a trivial amount of code to the project. However, this addition can't be made in this repository. superagent-cache uses cache-service-cache-module to store data by default (I own both repos). I'd prefer we move this issue to that repo. Let's talk implementation there.

sure, feel free to move the issue.

Guess the simplest eviction policy would be FIFO: if the limit is 10 cached requests, then the first 10 requests get cached. When the 11th request is made, the oldest cached request gets removed.

This would be trivial to implement with a hashmap that maintains insertion order (like Java's LinkedHashMap: https://docs.oracle.com/javase/8/docs/api/java/util/LinkedHashMap.html).
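JavaScript's built-in Map iterates in insertion order too, so a minimal FIFO wrapper needs very little code. A rough sketch (the names are illustrative, nothing here is from the repo):

```js
// Minimal FIFO-bounded store; JavaScript's Map iterates in insertion order.
function FifoStore(limit) {
  this.limit = limit;
  this.map = new Map();
}

FifoStore.prototype.get = function (key) {
  return this.map.get(key);
};

FifoStore.prototype.set = function (key, value) {
  // Re-setting an existing key shouldn't count as a new insertion.
  if (this.map.has(key)) this.map.delete(key);
  this.map.set(key, value);
  // Over the limit: evict the oldest entry (first key in iteration order).
  if (this.map.size > this.limit) {
    this.map.delete(this.map.keys().next().value);
  }
};
```

With a limit of 10, the 11th distinct set evicts whatever was inserted first, matching the behavior described above.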

The motivation is to prevent running out of memory if the cache grows too fast.

any ideas on how you'd like this implemented? Perhaps I could submit a PR.

First thing: are you using this in the browser or in node? If you're using it in node, I'd recommend considering using superagent-cache with cache-service-redis or cache-service-memcached so you have a distributed store as well as robust eviction policy features. There are plenty of redis hosts with a free tier like Redis Cloud (probably true with memcached as well).

However, if you're using this in the browser, or just want this feature available in cache-service-cache-module, read on.

I looked up JavaScript implementations of LinkedHashMap and there are some out there. I'm open to the idea of you taking one of those and replacing cache-service-cache-module's cache.db object with an instance of LinkedHashMap.

I did notice a complication with cache-service-cache-module while looking into implementing this last week. Currently, I'm expiring keys lazily: only when a key is requested do I check whether it should be expired. Obviously this would need to change to facilitate a good key-count limiter, since expired keys that are never requested again would otherwise keep occupying slots and force live keys to be evicted.
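For illustration, the limiter would need to sweep expired entries before counting. A rough sketch, assuming the store is an insertion-ordered Map whose records carry an expires timestamp (that shape is my assumption, not the module's actual code):

```js
// Rough sketch: sweep lazily-expired entries before enforcing a key-count
// limit, assuming each record is stored as { value: ..., expires: <ms epoch> }.
function pruneThenEnforceLimit(db, limit) {
  var now = Date.now();
  db.forEach(function (record, key) {
    if (record.expires <= now) db.delete(key); // drop dead keys first
  });
  // Still over the limit after sweeping? Evict oldest-inserted entries.
  while (db.size > limit) {
    db.delete(db.keys().next().value);
  }
}
```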

Let me know what you decide to do. If you decide you want to submit a PR, we can talk in more detail over on the cache-service-cache-module repo.

for now just in memory. i'm not sure adding redis and increasing the operational complexity would be justified in my case, but i'll think about it again.

Think I found a fairly simple solution: if you replace cache.db with LruMap or LfuMap (guess it stands for least frequently used) from https://github.com/montagejs/collections/blob/master/test/spec/lfu-map-spec.js, then that should be pretty much all that's needed. LfuMap will take care of evicting entries once the limit is reached, so you don't need to bother with that.
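If I'm reading the collections docs right, the constructor takes initial values plus a max length, so it would behave roughly like this (untested sketch; the (values, maxLength) signature is my reading of the docs, not something I've verified):

```js
// Sketch using montagejs/collections; assumes LruMap(values, maxLength).
var LruMap = require("collections/lru-map");

var db = new LruMap(null, 3); // keep at most 3 entries
db.set("a", 1);
db.set("b", 2);
db.set("c", 3);
db.get("a");    // touching "a" marks it most recently used
db.set("d", 4); // over capacity: "b" (least recently used) is evicted
db.has("b");    // false
```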

That sounds like a simple solution. One potential deal-breaker here is that I want cache-service-cache-module to stay tiny. I use this library in my own personal development so I have a vested interest in keeping it small. If we can do this with only a trivial increase in download size, then I'm on board.

If we can't, then I'm open to the idea of making the library's cache.db object configurable so that the default is tiny and you're able to swap it out for something more robust as needed. AFAIK, this is the best way to give everyone what they want while keeping the library tiny.
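To sketch what I mean (the store option below is hypothetical; it doesn't exist today):

```js
// Hypothetical sketch only: cache-service-cache-module has no `store`
// option today; this is what a pluggable cache.db could look like.
var LruMap = require("collections/lru-map");
var CacheModule = require("cache-service-cache-module");

var cache = new CacheModule({
  store: new LruMap(null, 1000) // any object with get/set/delete would do
});
```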

As a side note on the redis solution, if you run multiple instances of your node app, or anticipate doing so soon, then the operational complexity to which you referred is necessary. Regardless, I appreciate that you understand your app's needs far better than I do, so I won't mention that option again unless you have a question about it.

sounds reasonable. i'd be perfectly happy if cache.db could be changed so i can plug in LruMap. Or could you just copy the required files from https://github.com/montagejs/collections/ into cache-service-cache-module?

please correct me if i'm wrong, but it seems like the redis store cannot be configured to limit the number of cached items?

I believe you're correct, but you can get around that with redis's eviction policy settings. For example, if you were to set redis's eviction policy to volatile-lru, then redis would

evict keys by trying to remove the less recently used (LRU) keys first, but only among keys that have an expire set, in order to make space for the new data added.
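For reference, that's two directives in redis.conf (the same values can also be applied at runtime with CONFIG SET):

```
# redis.conf: cap memory and evict LRU keys, but only among keys with a TTL
maxmemory 256mb
maxmemory-policy volatile-lru
```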

If I understand correctly, then your memory limit would be whatever you've chosen with your provider, and, between the expirations you set and the eviction policy you choose, you would not run out of space. This would allow you to:

  1. share your cache entries between instances of your node app
  2. persist your cache entries across app deploys and restarts
  3. stop basing memory management of ambiguously-sized values on key count

I've been doing this for three years in a global, production application without issue. The overhead incurred on my application by fetching data from redis rather than from memory is usually somewhere between 5ms and 15ms. You can expect the same if your node application and redis host run in the same data center.

As for in-memory solutions, I'll review the ideas we discussed to see if there are any snags I've missed. If you decide to try out redis, let me know. I'm happy to help you set it up if needed.

I apologize, I haven't had time to tinker with an implementation for the in-memory solution yet.