Proposal to Integrate SIEVE Eviction Algorithm
Opened this issue · 12 comments
Our team (@1a1a11a) has developed a new cache eviction algorithm, called SIEVE. It’s simple, efficient, and scalable.
Why SIEVE could be a great addition:
- Simplicity: Integrating SIEVE is straightforward, usually needing to change less than 20 lines of code on average.
- Efficiency: On skewed workloads, which are typical in web caching scenarios, SIEVE is top-notch.
- Cache Primitive: SIEVE is not just another algorithm; it's a primitive that could enhance or replace LRU/FIFO queues in advanced systems like LeCaR, TwoQ, ARC, and S3-FIFO.
Welcome to dive into the details on our website sievecache.com and on our SIEVE blog.
We would love to explore the possibility of integrating SIEVE into lru-rs. We believe it could be a beneficial addition to the library and the community.
Looking forward to your feedback!
This is very cool! I think adding SIEVE would be a great addition to the library. Would it be possible to do so in an opt-in manner so that by default the cache implements a standard LRU algorithm, but the user can opt-in to using SIEVE instead if they want to?
Totally on board with making SIEVE as an opt-in feature! Really looking forward to seeing how it spices up the library. Please let me know if any asistance is needed :-)
I probably won't have any time to implement it myself in the near future, but would be happy to review any pull requests if you're able to contribute.
Sure. I would love to implement it on my end. Thank you for the support :-)
@yazhuo in your testing I can't find any results for RRIP based eviction strategies. Are you aware of them and have you tested them? For example Arm chose Dynamic RRIP (DRRIP) as improvement over PLRU for their recent Arm Cortex-X3 micro-architecture https://i0.wp.com/chipsandcheese.com/wp-content/uploads/2023/09/neoverse_v2_ldst.png?ssl=1, initially developed by Intel https://dl.acm.org/doi/10.1145/1815961.1815971. AFAIK they represent the state of the art, so I was surprised you don't mention them in your work.
Hi @Voultapher, if I remember correctly, RRIP-based eviction policies are for hardware caches (set-associative caches). If one wants to implement them in software (non set-associative) caches, it will incur O(logN) complexity at each eviction.
Oh, I wasn't aware of that, I'd like to learn more about it and couldn't find sources that talk about this in more detail, can you point me to some?
The RRIP paper states that DRRIP is "scan-resistant and thrash-resistant", are there O(N) software eviction strategies with the same properties?
I am not aware of any work compare hardware cache and software cache. This is kindly separately community. I am not a hardware cache expert, so my understanding is limited.
RRIPs are scan-resistant. But it depends on you define scan, if a scan means a request sequence larger than the cache size, than SIEVE is actually scan-resistant. We made a mistake using the term "scan-resistant" to mean something else.
Hi @jeromefroe, would you recommend we add a new struct for the SIEVE cache or add as an option to LRUCache?
let mut cache = LruCache::new(NonZeroUsize::new(2).unwrap(), useSieve=true);
or
let mut cache = SieveCache::new(NonZeroUsize::new(2).unwrap());
BTW, the changes are minimal if we add it as an option (an old commit that changed LRU to SIEVE can be seen cacheMon@1c45ef7)
Some new results if you are interested https://observablehq.com/@1a1a11a/sieve-miss-ratio-plots
Hey @1a1a11a! I think an option would be the way to go since the it's just an implementation detail in how the cache expires items. Glad to here the changes need to take that approach are minimal as well!
Just realize that using the option would break existing code since we cannot set a default value for the use_sieve
parameters. Any suggestion?
Instead of adding a new parameter to the new
constructor, what do you think about adding a new constructor for creating caches that use SIEVE? If we were to implement a Builder we could add it as a method to that.