/skiplist-survey

A comparison of skip lists written in Go

Primary LanguageGoMIT LicenseMIT

Survey of Skip List Implementations

Here is a brief summary of skip list packages available in Go that you may consider using after a quick Google/Github search. If you know of any others, please contact me so I can add them here.

Some things most of the packages have in common:

  • Keys are int type, which are 32 or 64 bit depending on GOARCH
  • Values are generally of type interface {}, so they accept any datatype (Go does not have generics).
  • The probability of adding new nodes to each linked level is P. The values vary from 0.25 to 0.5. This is an important parameter for performance tuning and memory usage.

Here are some brief notes on each implementation:

  • github.com/mtchavez/skiplist
    • Values are type []byte, which almost always means conversion.
    • Global constant for P value = 0.25, cannot be changed at runtime.
  • github.com/huandu/skiplist
    • Globally sets P to almost 0.25 (using bitmasks and shifting) and can be changed at runtime.
    • You must specify a comparator and type for keys when creating the list.
    • Keys are of type interface{}
    • Not threadsafe
  • github.com/zhenjl/skiplist
    • Adjustable P value and max level per list.
    • Adjustable insert level probability per list.
    • Allows duplicates stored at a single key and therefore does not have an update operation.
    • When creating a list, you specify a comparator. It has many built-in that are generated by running an external script that writes out a Go source file with the interfaces.
    • Uses separate search and insert fingers to speed up finding highly local keys consecutively.
    • Threadsafe but fingers are shared as well across all lists
  • github.com/golang-collections/go-datastructures/slice/skip
    • Intelligently sets maximum level based on key's datatype (uint8 up to uint64)
    • More complex interface; you must define an Entry type that implements the interface it specifies with a comparator
    • P value is a global constant, 0.5
  • github.com/ryszard/goskiplist
    • P value is a global constant, 0.25
    • Very straightforward implementation and interface
    • Not threadsafe
  • github.com/sean-public/fast-skiplist
    • Fastest concurrent implementation in all benchmarks; huandu is very close in every benchmark but it is not threadsafe.
    • See fast-skiplist's README for details on how this is achieved.

Benchmarks

Running the benchmarks found in this repo locally is easy:

go get github.com/sean-public/skiplist-survey
go install github.com/sean-public/skiplist-survey
skiplist-survey > output.csv

The results are in CSV format for easy charting and analysis.

Here is a summary of results I recorded on a Macbook Pro 15 with a 2.7 GHz Intel Core i7 and 16GB RAM. It takes over an hour to run all the benchmarks.

best inserts chart

The chart above shows the best-case insert speeds. The vertical axis is nanoseconds per operation and the horizontal is the number of items in the list. These are the "best" inserts because they happen at the front of the list, which shouldn't require any searching. The difference in speed here demonstrates the overhead each package introduces in even the most basic operations.

worst inserts chart

Worst-case inserts. These inserts are at the end of the list, requiring searching all the way to the end and then adding the new node. As you can see, mtchavez does not scale as well as the other implementations, which only show a small variance even after millions of nodes are added.

average search chart

Average search speed. You can see mtchavez again is not searching in O(log n) or even O(n). zhenjl also appears to have a lot of overhead in its search compared to the other implementations.

worst case delete chart

Worst case deletions. In this benchmark, a skip list of a given length is created and then every item is removed, starting from the last one and moving to the front.

zoom worse cases deletions

If we zoom in, we can see the speed differences between the fastest implementations a bit better. The fastest overall is sean, which averages 10-50ns faster in all operations than the next fastest implementation.

Todo

  • (in progress now) Add multithreaded benchmarks with varying concurrency and read/write load proportions.

  • Use very tall (max height) skiplists to demonstrate stress caused by multiple calls to random functions on insert.

  • Record total and high-water mark memory use.

  • Benchmark concurrent inserts on multiple lists to stress globally-locked PRNG in most implementations.