/pudge

Fast and simple key/value store written using Go's standard library

Primary LanguageGoMIT LicenseMIT

Documentation Go Report Card Build Status Mentioned in Awesome Go

Table of Contents

Description

Package pudge is a fast and simple key/value store written using Go's standard library.

It presents the following:

  • Supporting very efficient lookup, insertions and deletions
  • Performance is comparable to hash tables
  • Ability to get the data in sorted order, which enables additional operations like range scan
  • Select with limit/offset/from key, with ordering or by prefix
  • Safe for use in goroutines
  • Space efficient
  • Very short and simple codebase
  • Well tested, used in production

pudge

Usage

package main

import (
	"log"

	"github.com/recoilme/pudge"
)

func main() {
	// Close all database on exit
	defer pudge.CloseAll()

	// Set (directories will be created)
	pudge.Set("../test/test", "Hello", "World")

	// Get (lazy open db if needed)
	output := ""
	pudge.Get("../test/test", "Hello", &output)
	log.Println("Output:", output)

	ExampleSelect()
}


//ExampleSelect
func ExampleSelect() {
	cfg := &pudge.Config{
		SyncInterval: 1} // every second fsync
	db, err := pudge.Open("../test/db", cfg)
	if err != nil {
		log.Panic(err)
	}
	defer db.DeleteFile()
	type Point struct {
		X int
		Y int
	}
	for i := 100; i >= 0; i-- {
		p := &Point{X: i, Y: i}
		db.Set(i, p)
	}
	var point Point
	db.Get(8, &point)
	log.Println(point)
	// Output: {8 8}
	// Select 2 keys, from 7 in ascending order
	keys, _ := db.Keys(7, 2, 0, true)
	for _, key := range keys {
		var p Point
		db.Get(key, &p)
		log.Println(p)
	}
	// Output: {8 8}
	// Output: {9 9}
}

Cookbook

  • Store data of any type. Pudge uses Gob encoder/decoder internally. No limits on keys/values size.
pudge.Set("strings", "Hello", "World")
pudge.Set("numbers", 1, 42)

type User struct {
	Id int
	Name string
}
u := &User{Id: 1, Name: "name"}
pudge.Set("users", u.Id, u)
  • Pudge is stateless and safe for use in goroutines. You don't need to create/open files before use. Just write data to pudge, don't worry about state. web server example

  • Pudge is parallel. Readers don't block readers, but a writer - does, but by the stateless nature of pudge it's safe to use multiples files for storages.

Illustration from slowpoke (based on pudge)

  • Default store system: like memcache + file storage. Pudge uses in-memory hashmap for keys, and writes values to files (no value data stored in memory). But you may use inmemory mode for values, with custom config:
cfg = pudge.DefaultConfig()
cfg.StoreMode = 2
db, err := pudge.Open(dbPrefix+"/"+group, cfg)
...
db.Counter(key, val)

In that case, all data is stored in memory and will be stored on disk only on Close.

Example server for highload, with http api

  • You may use pudge as an engine for creating databases.

Example database

  • Don't forget to close all opened databases on shutdown/kill.
 	// Wait for interrupt signal to gracefully shutdown the server 
	quit := make(chan os.Signal)
	signal.Notify(quit, os.Interrupt, os.Kill)
	<-quit
	log.Println("Shutdown Server ...")
	if err := pudge.CloseAll(); err != nil {
		log.Println("Pudge Shutdown err:", err)
	}

example recovery function for gin framework

  • Pudge has a primitive select/query engine.
// Select 2 keys, from 7 in ascending order
   keys, _ := db.Keys(7, 2, 0, true)
// select keys from db where key>7 order by keys asc limit 2 offset 0
  • Pudge will work well on SSD or spined disks. Pudge doesn't eat memory or storage or your sandwich. No hidden compaction/rebalancing/resizing and so on tasks. No LSM Tree. No MMap. It's a very simple database with less than 500 LOC. It's good for simple social network or highload system

Disadvantages

  • No transaction system. All operations are isolated, but you don't may batching them with automatic rollback.
  • Keys function (select/query engine) may be slow. Speed of query may vary from 10ms to 1sec per million keys. Pudge don't use BTree/Skiplist or Adaptive radix tree for store keys in ordered way on every insert. Ordering operation is "lazy" and run only if needed.
  • If you need storage or database for hundreds of millions keys - take a look at Sniper or b52. They are optimized for highload (pudge - not).
  • No fsync on every insert. Most of database fsync data by the timer too
  • Deleted data don't remove from physically (but upsert will try to reuse space). You may shrink database only with backup right now
pudge.BackupAll("backup")
  • Keys automatically convert to binary and ordered with binary comparator. It's simple for use, but ordering will not work correctly for negative numbers for example
  • Author of project don't work at Google or Facebook and his name not Howard Chu or Brad Fitzpatrick. But I'm open for issue or contributions.

Motivation

Some databases very well for writing. Some of the databases very well for reading. But pudge is well balanced for both types of operations. It has small cute api, and don't have hidden graveyards. It's just hashmap where values written in files. And you may use one database for in-memory/persistent storage in a stateless stressfree way

Benchmarks

All tests here

Some tests, MacBook Pro (Retina, 13-inch, Early 2015)

Test 1

Number of keys: 1000000 Minimum key size: 16, maximum key size: 64 Minimum value size: 128, maximum value size: 512 Concurrency: 2

pogreb goleveldb bolt badgerdb pudge slowpoke pudge(mem)
1M (Put+Get), seconds 187 38 126 34 23 23 2
1M Put, ops/sec 5336 34743 8054 33539 47298 46789 439581
1M Get, ops/sec 1782423 98406 499871 220597 499172 445783 1652069
FileSize,Mb 568 357 552 487 358 358 358

Test 4

Number of keys: 10000000 Key size: 8 Value size: 16 Concurrency: 100

goleveldb badgerdb pudge
10M (Put+Get), seconds 165 120 243
10M Put, ops/sec 122933 135709 43843
10M Get, ops/sec 118722 214981 666067
FileSize,Mb 312 1370 381