golang/groupcache

Is it better to use a circular doubly linked list to implement the LRU cache?

shidenggui opened this issue · 2 comments

I found that Python's functools.lru_cache is implemented with a circular doubly linked list.

When the cache is full, it avoids the cost of inserting a new node at the front and unlinking the oldest node from the back: the old root is reused to store the new element, and the oldest element (the one that would otherwise be removed) becomes the new root.
See the code below:

                elif full:
                    # Use the old root to store the new key and result.
                    oldroot = root
                    oldroot[KEY] = key
                    oldroot[RESULT] = result
                    # Empty the oldest link and make it the new root.
                    # ......
                    root = oldroot[NEXT]
                    # ......
                    root[KEY] = root[RESULT] = None
                    # Now update the cache dictionary.
                    del cache[oldkey]
                    # ......
                    cache[key] = oldroot

I found that container/list is also implemented as a circular doubly linked list.

// Element is an element of a linked list.
type Element struct {
    // Next and previous pointers in the doubly-linked list of elements.
    // To simplify the implementation, internally a list l is implemented
    // as a ring, such that &l.root is both the next element of the last
    // list element (l.Back()) and the previous element of the first list
    // element (l.Front()).
    next, prev *Element

    // The list to which this element belongs.
    list *List

    // The value stored with this element.
    Value interface{}
}

// List represents a doubly linked list.
// The zero value for List is an empty list ready to use.
type List struct {
    root Element // sentinel list element, only &root, root.prev, and root.next are used
    len  int     // current list length excluding (this) sentinel element
}

// Init initializes or clears list l.
func (l *List) Init() *List {
    l.root.next = &l.root
    l.root.prev = &l.root
    l.len = 0
    return l
}
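For context, the conventional LRU shape built on the public container/list API can be sketched as follows (a minimal sketch with int keys and illustrative names): adding to a full cache allocates a new Element with PushFront and then unlinks the oldest with Remove(Back()), the two steps the rotation would merge into one.

```go
package main

import (
	"container/list"
	"fmt"
)

type entry struct {
	key, value int
}

type cache struct {
	max int
	ll  *list.List
	m   map[int]*list.Element
}

func newCache(max int) *cache {
	return &cache{max: max, ll: list.New(), m: make(map[int]*list.Element)}
}

func (c *cache) add(key, value int) {
	if e, ok := c.m[key]; ok {
		c.ll.MoveToFront(e)
		e.Value.(*entry).value = value
		return
	}
	c.m[key] = c.ll.PushFront(&entry{key, value}) // allocates a new Element
	if c.ll.Len() > c.max {
		oldest := c.ll.Back()
		c.ll.Remove(oldest) // unlinks the old Element
		delete(c.m, oldest.Value.(*entry).key)
	}
}

func (c *cache) get(key int) (int, bool) {
	if e, ok := c.m[key]; ok {
		c.ll.MoveToFront(e)
		return e.Value.(*entry).value, true
	}
	return 0, false
}

func main() {
	c := newCache(2)
	c.add(1, 10)
	c.add(2, 20)
	c.add(3, 30) // evicts key 1
	_, ok := c.get(1)
	fmt.Println(ok) // false
}
```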

Unfortunately, it doesn't expose root as a public field.

So, is it better to use a circular doubly linked list to implement an LRU cache, for better performance when the cache is full?

Is it better in what sense? Feel free to try it and report back.

What we have now works, though, and I'm not inclined to change it unless there's a compelling reason.

@bradfitz I reimplemented the add-new-element logic to reuse the root when the cache is full and got a ~1.3x speedup on my machine. See the benchmark below.

func BenchmarkLongScan(b *testing.B) {
    var lru *Cache
    for i := 0; i < b.N; i++ {
        lru = New(5)
        for j := 0; j < 10000; j++ {
            lru.Add(j, j)
        }
    }
}

current:

goos: darwin
goarch: amd64
pkg: github.com/golang/groupcache/lru
BenchmarkLongScan-8            500       2522453 ns/op
PASS
ok      github.com/golang/groupcache/lru    1.530s

new:

goos: darwin
goarch: amd64
pkg: github.com/golang/groupcache/lru
BenchmarkLongScan-8           1000       1959793 ns/op
PASS
ok      github.com/golang/groupcache/lru    2.168s

I added a new method to list:

// PushFrontByRotate inserts a new value v at the front of list l by rotating
// the list: the old root becomes the new front element, and the oldest
// element becomes the new root. It returns the added element and the removed
// value. Note that this requires changing List.root from an embedded Element
// to a pointer (*Element) so that it can be moved.
func (l *List) PushFrontByRotate(v interface{}) (added *Element, removed interface{}) {
    l.lazyInit()

    l.root.Value = v     // reuse the old root to hold the new value
    l.root = l.root.prev // the oldest element becomes the new root

    removed = l.root.Value
    l.root.Value = nil
    return l.root.next, removed
}

and rewrote the Add logic for when the cache is full:

// Add adds a value to the cache.
func (c *Cache) Add(key Key, value interface{}) {
    if c.cache == nil {
        c.cache = make(map[interface{}]*list.Element)
        c.ll = list.New()
    }
    if ee, ok := c.cache[key]; ok {
        c.ll.MoveToFront(ee)
        ee.Value.(*entry).value = value
        return
    }
    var ele *list.Element
    if c.MaxEntries != 0 && c.ll.Len() >= c.MaxEntries {
        var removed interface{}
        ele, removed = c.ll.PushFrontByRotate(&entry{key, value})
        c.removeValue(removed)
    } else {
        ele = c.ll.PushFront(&entry{key, value})
    }
    c.cache[key] = ele
}

func (c *Cache) removeValue(v interface{}) {
    kv := v.(*entry)
    delete(c.cache, kv.key)
    if c.OnEvicted != nil {
        c.OnEvicted(kv.key, kv.value)
    }
}

This is my change: 87ecc9a