BadgerDB
BadgerDB is an embeddable, persistent and fast key-value (KV) database written in pure Go. It is the underlying database for Dgraph, a fast, distributed graph database. It's meant to be a performant alternative to non-Go-based key-value stores like RocksDB.
Use Discuss Issues for reporting issues about this repository.
Project Status [March 24, 2020]
Badger is stable and is being used to serve data sets worth hundreds of
terabytes. Badger supports concurrent ACID transactions with serializable
snapshot isolation (SSI) guarantees. A Jepsen-style bank test runs nightly for
8h with the --race flag and ensures that transactional guarantees are maintained.
Badger has also been tested against filesystem-level anomalies to ensure
persistence and consistency. Badger is used by a number of projects,
including Dgraph, Jaeger Tracing, UsenetExpress, and many more.
The list of projects using Badger can be found here.
Badger v1.0 was released in Nov 2017, and the latest version that is data-compatible with v1.0 is v1.6.0.
Badger v2.0 was released in Nov 2019 with a new storage format that is not compatible with any v1.x release. Badger v2.0 supports compression and encryption, and uses a cache to speed up lookups.
The Changelog is kept fairly up-to-date.
For more details on our version naming schema please read Choosing a version.
Getting Started
Installing
To start using Badger, install Go 1.12 or above. Badger v2 and later require Go modules. Run the following command to retrieve the library.
$ go get github.com/dgraph-io/badger/v3
Note: Badger does not directly use CGO but it relies on https://github.com/DataDog/zstd for compression, which requires gcc/cgo. If you wish to use Badger without gcc/cgo, you can run the following command, which will download Badger without support for the ZSTD compression algorithm.
$ CGO_ENABLED=0 go get github.com/dgraph-io/badger/v3
Installing Badger Command Line Tool
Download and extract the latest Badger DB release from https://github.com/dgraph-io/badger/releases and then run the following commands.
$ cd badger-<version>/badger
$ go install
This will install the badger command line utility into your $GOBIN path.
Choosing a version
BadgerDB is unusual in that the most important change we can make to it is not to its API but to how data is stored on disk.
This is why we follow a version naming schema that differs from Semantic Versioning.
- New major versions are released when the data format on disk changes in an incompatible way.
- New minor versions are released whenever the API changes but data compatibility is maintained. Note that, unlike in Semantic Versioning, these API changes may be backward-incompatible.
- New patch versions are released when there are no changes to the data format or the API.
Following these rules:
- v1.5.0 and v1.6.0 can be used on top of the same files without any concerns, as their major version is the same, therefore the data format on disk is compatible.
- v1.6.0 and v2.0.0 are data incompatible as their major version implies, so files created with v1.6.0 will need to be converted into the new format before they can be used by v2.0.0.
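The rule above reduces to comparing major versions. The following is a minimal sketch of that check using only the Go standard library; the names majorOf and DataCompatible are hypothetical helpers written for this illustration and are not part of Badger.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// majorOf extracts the major component from a version string such as "v1.6.0".
// Hypothetical helper for illustration; not part of the Badger API.
func majorOf(version string) (int, error) {
	trimmed := strings.TrimPrefix(version, "v")
	parts := strings.SplitN(trimmed, ".", 2)
	major, err := strconv.Atoi(parts[0])
	if err != nil {
		return 0, fmt.Errorf("invalid version %q: %v", version, err)
	}
	return major, nil
}

// DataCompatible reports whether two releases can share the same files,
// which under the naming schema above means they share a major version.
func DataCompatible(a, b string) (bool, error) {
	majorA, err := majorOf(a)
	if err != nil {
		return false, err
	}
	majorB, err := majorOf(b)
	if err != nil {
		return false, err
	}
	return majorA == majorB, nil
}

func main() {
	pairs := [][2]string{{"v1.5.0", "v1.6.0"}, {"v1.6.0", "v2.0.0"}}
	for _, p := range pairs {
		ok, err := DataCompatible(p[0], p[1])
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s and %s data-compatible: %t\n", p[0], p[1], ok)
	}
}
```

Note that the check says nothing about API compatibility: under this schema, two releases with the same major version may still differ in their API.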
For a longer explanation on the reasons behind using a new versioning naming schema, you can read VERSIONING.md.
Badger Documentation
Badger Documentation is available at https://dgraph.io/docs/badger
Resources
Blog Posts
- Introducing Badger: A fast key-value store written natively in Go
- Make Badger crash resilient with ALICE
- Badger vs LMDB vs BoltDB: Benchmarking key-value databases in Go
- Concurrent ACID Transactions in Badger
Design
Badger was written with these design goals in mind:
- Write a key-value database in pure Go.
- Use latest research to build the fastest KV database for data sets spanning terabytes.
- Optimize for SSDs.
Badger’s design is based on a paper titled WiscKey: Separating Keys from Values in SSD-conscious Storage.
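To illustrate the idea of separating keys from values, here is a toy in-memory sketch. It is not Badger's implementation: a Go map stands in for the LSM tree and a byte slice stands in for the on-disk value log, and the Store and pointer types are hypothetical names chosen for this example.

```go
package main

import "fmt"

// pointer records where a value lives inside the value log.
type pointer struct {
	offset, length int
}

// Store is a toy key-value store that keeps keys in a small index and
// appends values to a separate log. The map stands in for an LSM tree
// and the byte slice stands in for an on-disk value log.
type Store struct {
	index map[string]pointer
	vlog  []byte
}

// NewStore returns an empty toy store.
func NewStore() *Store {
	return &Store{index: make(map[string]pointer)}
}

// Set appends the value to the log and stores only its location in the index.
func (s *Store) Set(key string, value []byte) {
	s.index[key] = pointer{offset: len(s.vlog), length: len(value)}
	s.vlog = append(s.vlog, value...)
}

// Get finds the pointer in the index, then reads the value from the log.
func (s *Store) Get(key string) ([]byte, bool) {
	p, ok := s.index[key]
	if !ok {
		return nil, false
	}
	out := make([]byte, p.length)
	copy(out, s.vlog[p.offset:p.offset+p.length])
	return out, true
}

func main() {
	s := NewStore()
	s.Set("a", []byte("first value"))
	s.Set("b", []byte("second value"))
	s.Set("a", []byte("updated"))

	v, _ := s.Get("a")
	fmt.Printf("a = %s, log bytes = %d, index entries = %d\n", v, len(s.vlog), len(s.index))
}
```

Because the index holds only small pointers, the structure that must be kept sorted stays small no matter how large the values are. The sketch never reclaims the stale bytes left in the log after an overwrite; a real system needs some form of garbage collection for that.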
Comparisons
Feature | Badger | RocksDB | BoltDB |
---|---|---|---|
Design | LSM tree with value log | LSM tree only | B+ tree |
High Read throughput | Yes | No | Yes |
High Write throughput | Yes | Yes | No |
Designed for SSDs | Yes (with latest research 1) | Not specifically 2 | No |
Embeddable | Yes | Yes | Yes |
Sorted KV access | Yes | Yes | Yes |
Pure Go (no Cgo) | Yes | No | Yes |
Transactions | Yes, ACID, concurrent with SSI3 | Yes (but non-ACID) | Yes, ACID |
Snapshots | Yes | Yes | Yes |
TTL support | Yes | Yes | No |
3D access (key-value-version) | Yes4 | No | No |
1 The WISCKEY paper (on which Badger is based) saw big wins with separating values from keys, significantly reducing the write amplification compared to a typical LSM tree.
2 RocksDB is an SSD-optimized version of LevelDB, which was designed specifically for rotating disks. As such, RocksDB's design isn't aimed at SSDs.
3 SSI: Serializable Snapshot Isolation. For more details, see the blog post Concurrent ACID Transactions in Badger
4 Badger provides direct access to value versions via its Iterator API. Users can also specify how many versions to keep per key via Options.
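The notion of key-value-version access with a bounded number of retained versions (footnote 4) can be sketched with a toy in-memory store. This does not use Badger's Iterator API or Options; VersionedStore and its methods are hypothetical names for illustration only.

```go
package main

import "fmt"

// versioned pairs a value with the version at which it was written.
type versioned struct {
	version uint64
	value   string
}

// VersionedStore is a toy store that retains up to keep versions per key.
type VersionedStore struct {
	keep    int
	next    uint64
	entries map[string][]versioned
}

// NewVersionedStore creates a store retaining at least one version per key.
func NewVersionedStore(keep int) *VersionedStore {
	if keep < 1 {
		keep = 1
	}
	return &VersionedStore{keep: keep, entries: make(map[string][]versioned)}
}

// Set writes a new version of key and drops the oldest versions beyond keep.
// It returns the version number assigned to this write.
func (s *VersionedStore) Set(key, value string) uint64 {
	s.next++
	vs := append(s.entries[key], versioned{version: s.next, value: value})
	if len(vs) > s.keep {
		vs = vs[len(vs)-s.keep:]
	}
	s.entries[key] = vs
	return s.next
}

// Versions returns the retained values for key, newest first.
func (s *VersionedStore) Versions(key string) []string {
	vs := s.entries[key]
	out := make([]string, 0, len(vs))
	for i := len(vs) - 1; i >= 0; i-- {
		out = append(out, vs[i].value)
	}
	return out
}

func main() {
	s := NewVersionedStore(2)
	s.Set("color", "red")
	s.Set("color", "green")
	s.Set("color", "blue")
	fmt.Println(s.Versions("color"))
}
```

With keep set to 2, only the two most recent writes for a key survive, which mirrors the idea of configuring how many versions to keep per key.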
Benchmarks
We have run comprehensive benchmarks against RocksDB, Bolt and LMDB. The benchmarking code and the detailed logs for the benchmarks can be found in the badger-bench repo. More explanation, including graphs, can be found in the blog posts (linked above).
Projects Using Badger
Below is a list of known projects that use Badger:
- Dgraph - Distributed graph database.
- Jaeger - Distributed tracing platform.
- go-ipfs - Go client for the InterPlanetary File System (IPFS), a new hypermedia distribution protocol.
- Riot - An open-source, distributed search engine.
- emitter - Scalable, low latency, distributed pub/sub broker with message storage, uses MQTT, gossip and badger.
- OctoSQL - Query tool that allows you to join, analyse and transform data from multiple databases using SQL.
- Dkron - Distributed, fault tolerant job scheduling system.
- smallstep/certificates - Step-ca is an online certificate authority for secure, automated certificate management.
- Sandglass - Distributed, horizontally scalable, persistent, time-sorted message queue.
- TalariaDB - Grab's Distributed, low latency time-series database.
- Sloop - Salesforce's Kubernetes History Visualization Project.
- Immudb - Lightweight, high-speed immutable database for systems and applications.
- Usenet Express - Serving over 300TB of data with Badger.
- gorush - A push notification server written in Go.
- Dispatch Protocol - Blockchain protocol for distributed application data analytics.
- GarageMQ - AMQP server written in Go.
- RedixDB - A real-time persistent key-value store that speaks the Redis protocol.
- BBVA - Raft backend implementation using BadgerDB for Hashicorp raft.
- Fantom - aBFT Consensus platform for distributed applications.
- decred - An open, progressive, and self-funding cryptocurrency with a system of community-based governance integrated into its blockchain.
- OpenNetSys - Create useful dApps in any software language.
- HoneyTrap - An extensible and opensource system for running, monitoring and managing honeypots.
- Insolar - Enterprise-ready blockchain platform.
- IoTeX - The next generation of the decentralized network for IoT powered by scalability- and privacy-centric blockchains.
- go-sessions - The sessions manager for Go net/http and fasthttp.
- Babble - BFT Consensus platform for distributed applications.
- Tormenta - Embedded object-persistence layer / simple JSON database for Go projects.
- BadgerHold - An embeddable NoSQL store for querying Go types built on Badger
- Goblero - Pure Go embedded persistent job queue backed by BadgerDB
- Surfline - Serving global wave and weather forecast data with Badger.
- Cete - Simple and highly available distributed key-value store built on Badger. Makes it easy to bring up a Badger cluster using the Raft consensus algorithm via hashicorp/raft.
- Volument - A new take on website analytics backed by Badger.
- KVdb - Hosted key-value store and serverless platform built on top of Badger.
- Terminotes - Self hosted notes storage and search server - storage powered by BadgerDB
- Pyroscope - Open source continuous profiling platform built with BadgerDB.
- Veri - A distributed feature store optimized for Search and Recommendation tasks.
- bIter - A library and Iterator interface for working with the badger.Iterator, simplifying from-to, and prefix mechanics.
- ld - (Lean Database) A very simple gRPC-only key-value database, exposing BadgerDB with key-range scanning semantics.
If you are using Badger in a project please send a pull request to add it to the list.
Contributing
If you're interested in contributing to Badger see CONTRIBUTING.md.
Contact
- Please use discuss.dgraph.io for questions, feature requests and discussions.
- Please use the GitHub issue tracker for filing bugs or feature requests.
- Follow us on Twitter @dgraphlabs.