dreymonde/Shallows

Cache Expiration

Closed this issue · 5 comments

Hi, very nice library. Its API has given me a lot of ideas about how to approach composition in my own app.

I've recently integrated with an ObjC caching library (SDWebImage). It supports cache expiration, and I was wondering if you planned on providing such functionality in this library?

It seems like some of this would be supported by making use of combined storages and possibly the "zipped" functionality. So you could have one storage contain keys and "inserted on" dates while another storage contains keys and the values you care about (for example, images). Cleaning up of expired keys could be handled on an as-needed basis by iterating through all the keys and comparing their "inserted on" dates with some expiry date.
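A rough sketch of that idea, with two plain dictionaries standing in for the two storages. The names `ExpiringCache`, `insertedOn` and `removeEntries(insertedBefore:)` are hypothetical and not part of Shallows, and everything is synchronous for brevity:

```swift
import Foundation

// Hypothetical stand-in for the two zipped storages: `insertedOn` maps keys
// to insertion dates, `values` maps the same keys to the cached payloads.
struct ExpiringCache<Key: Hashable, Value> {
    private(set) var insertedOn: [Key: Date] = [:]
    private(set) var values: [Key: Value] = [:]

    init() {}

    // Stores the value and records when it was inserted.
    mutating func set(_ value: Value, forKey key: Key, at date: Date = Date()) {
        values[key] = value
        insertedOn[key] = date
    }

    func value(forKey key: Key) -> Value? {
        return values[key]
    }

    // Removes every entry inserted before `expiryDate` and returns the removed keys.
    @discardableResult
    mutating func removeEntries(insertedBefore expiryDate: Date) -> [Key] {
        let expired = insertedOn.filter { $0.value < expiryDate }.map { $0.key }
        for key in expired {
            insertedOn[key] = nil
            values[key] = nil
        }
        return expired
    }
}
```

The cleanup walks every key, so it is best run occasionally (for example on launch) rather than on every read.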

Where I'm unsure is how you'd implement something like a "Least Recently Used" strategy - wherein accessing a key updates its "inserted on" date (you'd probably rename it to "updated on").

Imagine the following:

LRU Image Storage = Date Storage zipped with (In Memory Image Storage+Disk Image Storage)

If I read from this LRU Image Storage, I would want it to get from the In Memory storage first followed by the Disk storage if needed (that would be using the default pull strategy for combined storages). But how would I go about updating the entry in the Date Storage as the read occurred? I suppose a brute-force way to do so would be to just re-insert the image that was returned during the read, but that seems non-ideal from a usability/performance standpoint.
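To make the desired behaviour concrete, here is a minimal synchronous sketch with plain dictionaries standing in for the three storages. `LRUImageCache` and its members are hypothetical names, and none of this is Shallows API. The point is that a read touches only the date entry and never re-inserts the image into the layer it came from:

```swift
import Foundation

// Hypothetical stand-in for "Date Storage zipped with (Memory + Disk)".
struct LRUImageCache {
    var memory: [String: Data] = [:]
    var disk: [String: Data] = [:]
    var updatedOn: [String: Date] = [:]

    // Reads memory first, then disk, and records the access time on a hit.
    mutating func image(forKey key: String, now: Date = Date()) -> Data? {
        let hit: Data?
        if let inMemory = memory[key] {
            hit = inMemory
        } else if let onDisk = disk[key] {
            memory[key] = onDisk   // pull the value up into the faster layer
            hit = onDisk
        } else {
            hit = nil
        }
        // Only the date entry is updated; a miss leaves the dates untouched.
        if hit != nil { updatedOn[key] = now }
        return hit
    }
}
```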

Hi @nmccann! Thanks for your thoughts, I really appreciate it.
I’m having a day off now, but I’ll get back on Sunday to discuss this very interesting question at length. So stay tuned! 🙂

Okay, I’m here now.
Cache expiration is, of course, a very useful and reasonable requirement for many apps, but its implementation is way too opinionated to be included in the library. I view Shallows as a toolbox that doesn’t perform any kind of dark magic under the hood.

As for the implementation itself, I think it can be done pretty easily. Here’s what I came up with:

final class LeastRecentlyUsedStorage<Key : Hashable, Value> : StorageProtocol {
    
    let storage: Storage<Key, Value>
    let metadata: Storage<Void, [Key : Date]>
    
    init(storage: Storage<Key, Value>, metadata: Storage<Void, [Key : Date]>) {
        self.storage = storage
        self.metadata = metadata
    }
    
    func retrieve(forKey key: Key, completion: @escaping (Result<Value>) -> ()) {
        storage.retrieve(forKey: key) { (result) in
            // Record the access time for this key (whether or not the read succeeded),
            // then forward the result to the caller.
            self.metadata.update({ $0[key] = Date() })
            completion(result)
        }
    }
    
    func set(_ value: Value, forKey key: Key, completion: @escaping (Result<Void>) -> ()) {
        storage.set(value, forKey: key) { (result) in
            // A write also counts as a use of the key.
            self.metadata.update({ $0[key] = Date() })
            completion(result)
        }
    }
    
}

As you can see, it showcases what I love about Shallows: you know next to nothing about where or how the values are stored; you only declare the logic of the LRU component. It also doesn’t perform any kind of cache clearance itself, but you can write a simple extension to simplify this process:

extension LeastRecentlyUsedStorage {
    
    func keys(usedEarlierThan date: Date, completion: @escaping (Result<[Key]>) -> ()) {
        metadata.asReadOnlyStorage()
            .mapValues(to: [Key].self, { meta in
                let filtered = meta.filter({ key, value in value < date })
                return Array(filtered.keys)
            })
            .retrieve(completion: completion)
    }
    
}

You can then do whatever cleanup logic you need outside the LeastRecentlyUsedStorage object.
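For instance, a size-based eviction policy could be layered on top of the same metadata. The sketch below is only an illustration: it works on a plain `[Key: Date]` dictionary rather than on the metadata storage itself, and `evictionCandidates(in:keeping:)` is a made-up helper name, not something Shallows provides:

```swift
import Foundation

// Hypothetical helper: given the "last used" dates, returns the least
// recently used keys that must go so that at most `maxCount` entries remain.
func evictionCandidates<Key: Hashable>(in lastUsed: [Key: Date], keeping maxCount: Int) -> [Key] {
    let overflow = lastUsed.count - maxCount
    guard overflow > 0 else { return [] }
    return lastUsed
        .sorted { $0.value < $1.value }   // oldest access first
        .prefix(overflow)
        .map { $0.key }
}
```

The returned keys would then be removed from both the underlying storage and the metadata by whatever code owns the cache.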

Ping me if you have any questions!

That is a very nice solution! And yes, I agree that it would be difficult to come up with something that would handle all possible cases - better to give users the equipment needed to implement it themselves. While a specialized solution may give better performance/more compact storage (for instance, by storing the access date in extended file attributes), it would be so specialized that it might be impossible to mix with any of the other tools in your toolbox analogy.

One question I do have: Is there any particular reason why your metadata property is of type Storage<Void, [Key : Date]> rather than Storage<Key, Date>?

@nmccann

...it would be so specialized that it might be impossible to mix with any of the other tools in your toolbox analogy

That’s very true! That’s a tricky balance for sure, but I mostly try to keep my structures abstract.

Speaking of metadata — well, the only reason to have that Storage<Void, [Key : Date]> is to have the ability to iterate over keys.

Ah I see, because StorageProtocol only has methods for getting/setting one key at a time. Maybe that would be something useful for the toolbox - an IterableStorageProtocol. That said, providing implementations for all of the existing storage types would add a lot of extra surface area to the codebase when you could just use the workaround of Storage<Void, [Key : Date]>.
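Just to illustrate what I mean, such a protocol might look roughly like the following. This is purely hypothetical: `IterableStorageProtocol`, `allKeys()` and `InMemoryIterableStorage` do not exist in Shallows, and the sketch is synchronous rather than completion-based to keep it short:

```swift
// Hypothetical protocol, not part of Shallows: a storage that can also list its keys.
protocol IterableStorageProtocol {
    associatedtype Key: Hashable
    associatedtype Value
    func retrieve(forKey key: Key) -> Value?
    mutating func set(_ value: Value, forKey key: Key)
    func allKeys() -> [Key]
}

// A dictionary-backed implementation of the hypothetical protocol.
struct InMemoryIterableStorage<Key: Hashable, Value>: IterableStorageProtocol {
    private var storage: [Key: Value] = [:]

    init() {}

    func retrieve(forKey key: Key) -> Value? { return storage[key] }
    mutating func set(_ value: Value, forKey key: Key) { storage[key] = value }
    func allKeys() -> [Key] { return Array(storage.keys) }
}
```

Every storage type would need its own `allKeys()` (listing files on disk, for example), which is where the extra surface area comes from.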

In any case, I think we can close this issue. Thank you for your help!