/cockroach

A Scalable, Survivable, Strongly-Consistent SQL Database

Primary language: Go. License: Apache License 2.0.

What is CockroachDB?

CockroachDB is a distributed SQL database built on a transactional and strongly-consistent key-value store. It scales horizontally; survives disk, machine, rack, and even datacenter failures with minimal latency disruption and no manual intervention; supports strongly-consistent ACID transactions; and provides a familiar SQL API for structuring, manipulating, and querying data.

For more details, see our FAQ, documentation, and design overview.

Status

CockroachDB is currently in beta. See our Roadmap and Issues for a list of features planned or in development.

Quickstart

  1. Install CockroachDB.

  2. Start a local cluster with three nodes listening on different ports:

    $ ./cockroach start --insecure &
    $ ./cockroach start --insecure --store=cockroach-data2 --port=26258 --http-port=8081 --join=localhost:26257 &
    $ ./cockroach start --insecure --store=cockroach-data3 --port=26259 --http-port=8082 --join=localhost:26257 &
  3. Start the built-in SQL client as an interactive shell:

    $ ./cockroach sql --insecure
    # Welcome to the cockroach SQL interface.
    # All statements must be terminated by a semicolon.
    # To exit: CTRL + D.
  4. Run some CockroachDB SQL statements:

    root@:26257> CREATE DATABASE bank;
    CREATE DATABASE
    
    root@:26257> SET DATABASE = bank;
    SET
    
    root@:26257> CREATE TABLE accounts (id INT PRIMARY KEY, balance DECIMAL);
    CREATE TABLE
    
    root@:26257> INSERT INTO accounts VALUES (1234, DECIMAL '10000.50');
    INSERT 1
    
    root@:26257> SELECT * FROM accounts;
    +------+----------+
    |  id  | balance  |
    +------+----------+
    | 1234 | 10000.50 |
    +------+----------+
  5. Check out the admin UI by pointing your browser to http://<localhost>:8080.

  6. CockroachDB makes it easy to secure a cluster.

Client Drivers

CockroachDB supports the PostgreSQL wire protocol, so you can use any available PostgreSQL client drivers to connect from various languages. For recommended drivers that we've tested, see Install Client Drivers.
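Because the wire protocol is PostgreSQL's, a client typically connects with a PostgreSQL-style connection URL. The sketch below builds such a URL for the insecure Quickstart cluster using only the Go standard library. The helper name `connURL` is hypothetical, and the use of `sslmode=disable` for an insecure cluster is an assumption based on standard PostgreSQL connection parameters, not something this README specifies.

```go
package main

import (
	"fmt"
	"net/url"
)

// connURL is a hypothetical helper that builds a PostgreSQL-style
// connection URL. When insecure is true it adds sslmode=disable, which
// is assumed here to match a cluster started with --insecure.
func connURL(user, host string, port int, database string, insecure bool) string {
	q := url.Values{}
	if insecure {
		q.Set("sslmode", "disable")
	}
	u := url.URL{
		Scheme:   "postgresql",
		User:     url.User(user),
		Host:     fmt.Sprintf("%s:%d", host, port),
		Path:     "/" + database,
		RawQuery: q.Encode(),
	}
	return u.String()
}

func main() {
	// Values taken from the Quickstart: user root, port 26257, database bank.
	fmt.Println(connURL("root", "localhost", 26257, "bank", true))
	// prints postgresql://root@localhost:26257/bank?sslmode=disable
}
```

A string like this could then be handed to a PostgreSQL driver through Go's `database/sql` package; which driver to use is covered in Install Client Drivers.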

Deployment

  • Manual - Steps to deploy a CockroachDB cluster manually on multiple machines.

  • Cloud - A sample configuration to run an insecure CockroachDB cluster on AWS using Terraform.

Get In Touch

When you see a bug or have improvements to suggest, please open an issue.

For development-related questions and anything else, there are two easy ways to get in touch:

Contributing

We're an open source project and welcome contributions.

  1. See CONTRIBUTING.md to get your local environment set up.

  2. Take a look at our open issues, in particular those with the helpwanted label.

  3. Review our style guide and follow our code reviews to learn about our style and conventions.

  4. Make your changes according to our code review workflow.

Talks

The best ones to start with:

Other talks of interest:

Design

This is an overview. For an in-depth discussion of the design and architecture, see the full design doc. For another quick design overview, see the CockroachDB tech talk slides.

Overview

CockroachDB is a distributed SQL database built on top of a transactional and consistent key-value store. The primary design goals are support for ACID transactions, horizontal scalability, and survivability, hence the name. CockroachDB uses the Raft consensus algorithm for consistency. It aims to tolerate disk, machine, rack, and even datacenter failures with minimal latency disruption and no manual intervention. CockroachDB nodes (RoachNodes) are symmetric; a design goal is homogeneous deployment (one binary) with minimal configuration.

CockroachDB implements a single, monolithic sorted map from key to value where both keys and values are byte strings (not Unicode). CockroachDB scales linearly, theoretically up to 4 exabytes (4E) of logical data. The map is composed of one or more ranges. Each range is backed by data stored in RocksDB (a variant of LevelDB) and is replicated to a total of three or more CockroachDB servers. Ranges are defined by start and end keys. Ranges are merged and split to maintain total byte size within a globally configurable min/max size interval. Range sizes default to a target of 64M in order to facilitate quick splits and merges and to distribute load at hotspots within a key range. Range replicas are intended to be located in disparate datacenters for survivability (e.g. { US-East, US-West, Japan }, { Ireland, US-East, US-West }, { Ireland, US-East, US-West, Japan, Australia }).
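To make the range structure concrete, here is a minimal sketch of a sorted key space divided into ranges, with a binary-search lookup and a split. The `Range` type, the function names, and the reading of "64M" as 64 MiB are all assumptions made for illustration; this is not CockroachDB's actual code.

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
)

// Range is a hypothetical descriptor for one contiguous slice of the
// key space. It covers keys in [Start, End); a nil End means "up to the
// end of the key space".
type Range struct {
	Start, End []byte
}

// maxRangeBytes assumes the 64M default target means 64 MiB.
const maxRangeBytes = 64 << 20

// needsSplit reports whether a range of the given size exceeds the target.
func needsSplit(sizeBytes int64) bool { return sizeBytes > maxRangeBytes }

// lookup returns the index of the range holding key. It assumes ranges
// are sorted by Start, do not overlap, and that the first range starts
// at the empty key so the whole key space is covered.
func lookup(ranges []Range, key []byte) int {
	// Find the first range that starts after key; the one before it holds key.
	i := sort.Search(len(ranges), func(i int) bool {
		return bytes.Compare(ranges[i].Start, key) > 0
	})
	return i - 1
}

// split replaces the range at index i with two ranges divided at splitKey
// and returns the new, still sorted, slice.
func split(ranges []Range, i int, splitKey []byte) []Range {
	r := ranges[i]
	out := append([]Range{}, ranges[:i]...)
	out = append(out, Range{Start: r.Start, End: splitKey}, Range{Start: splitKey, End: r.End})
	return append(out, ranges[i+1:]...)
}

func main() {
	ranges := []Range{
		{Start: []byte(""), End: []byte("g")},
		{Start: []byte("g"), End: []byte("p")},
		{Start: []byte("p"), End: nil},
	}
	fmt.Println(lookup(ranges, []byte("apple"))) // 0
	fmt.Println(lookup(ranges, []byte("zebra"))) // 2

	// Pretend the middle range grew past the target and split it at "k".
	if needsSplit(70 << 20) {
		ranges = split(ranges, 1, []byte("k"))
	}
	fmt.Println(len(ranges), lookup(ranges, []byte("mango"))) // 4 2
}
```

Keeping the ranges sorted by start key is what lets a lookup run in logarithmic time, and a split only touches the one range being divided.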

Single mutations to ranges are mediated via an instance of a distributed consensus algorithm to ensure consistency. We’ve chosen to use the Raft consensus algorithm. All consensus state is stored in RocksDB.
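The reason a range needs three or more replicas follows from majority voting. The sketch below shows only the quorum arithmetic that consensus protocols such as Raft rely on; it deliberately leaves out leader election, terms, and log replication, and the function names are hypothetical.

```go
package main

import "fmt"

// quorum returns how many replicas must acknowledge a mutation before
// it counts as committed: a strict majority of the replicas.
func quorum(replicas int) int { return replicas/2 + 1 }

// committed reports whether acks acknowledgements are enough to commit.
func committed(acks, replicas int) bool { return acks >= quorum(replicas) }

// tolerated returns how many replicas can fail while a quorum remains.
func tolerated(replicas int) int { return replicas - quorum(replicas) }

func main() {
	for _, n := range []int{3, 5} {
		fmt.Printf("replicas=%d quorum=%d tolerated failures=%d\n",
			n, quorum(n), tolerated(n))
	}
	// replicas=3 quorum=2 tolerated failures=1
	// replicas=5 quorum=3 tolerated failures=2
}
```

With three replicas, one can be lost without blocking writes; with five spread across datacenters, two can.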

A single logical mutation may affect multiple key/value pairs. Logical mutations have ACID transactional semantics. If all keys affected by a logical mutation fall within the same range, atomicity and consistency are guaranteed by Raft; this is the fast commit path. Otherwise, a non-locking distributed commit protocol is employed between affected ranges.
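The choice between the two commit paths comes down to whether every affected key lands in one range. The following sketch makes that decision for a set of keys, assuming ranges are described by their sorted split keys. The names `rangeIndex`, `singleRange`, and `commitPath` are hypothetical, and the distributed commit protocol itself is not modeled.

```go
package main

import (
	"fmt"
	"sort"
)

// rangeIndex returns which range holds key, given the sorted split keys
// that separate adjacent ranges; n split keys define n+1 ranges.
func rangeIndex(splitKeys []string, key string) int {
	// The count of split keys <= key is the index of the range holding key.
	return sort.Search(len(splitKeys), func(i int) bool { return splitKeys[i] > key })
}

// singleRange reports whether all keys fall within the same range.
func singleRange(splitKeys, keys []string) bool {
	for i := 1; i < len(keys); i++ {
		if rangeIndex(splitKeys, keys[i]) != rangeIndex(splitKeys, keys[0]) {
			return false
		}
	}
	return true
}

// commitPath names the path a mutation touching keys would take.
func commitPath(splitKeys, keys []string) string {
	if singleRange(splitKeys, keys) {
		return "fast"
	}
	return "distributed"
}

func main() {
	splitKeys := []string{"g", "p"} // three ranges: [..g), [g..p), [p..)
	fmt.Println(commitPath(splitKeys, []string{"apple", "banana"})) // fast
	fmt.Println(commitPath(splitKeys, []string{"apple", "horse"}))  // distributed
}
```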

CockroachDB provides snapshot isolation (SI) and serializable snapshot isolation (SSI) semantics, allowing externally consistent, lock-free reads and writes, both from a historical snapshot timestamp and from the current wall clock time. SI provides lock-free reads and writes but still allows write skew. SSI eliminates write skew, but introduces a performance hit in the case of a contentious system. SSI is the default isolation; clients must consciously decide to trade correctness for performance. CockroachDB implements a limited form of linearizability, providing ordering for any observer or chain of observers.
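Write skew is easiest to see in a toy simulation. In the sketch below, two transactions each read both `x` and `y` from the same snapshot, check that `x + y >= 100`, and then withdraw 100 from different keys. The store, the transaction type, and the conflict checks are a deliberately simplified illustration (a commit-time check of read and write sets); they are assumptions for teaching purposes and not how CockroachDB detects conflicts.

```go
package main

import "fmt"

// Store is a toy versioned store: the latest value per key, plus the
// logical commit timestamp of the last write to each key.
type Store struct {
	clock   int
	values  map[string]int
	written map[string]int
}

// Txn reads from a snapshot taken at Begin and buffers its writes.
type Txn struct {
	start  int
	snap   map[string]int
	reads  map[string]bool
	writes map[string]int
}

func NewStore(init map[string]int) *Store {
	return &Store{values: init, written: map[string]int{}}
}

func (s *Store) Begin() *Txn {
	snap := map[string]int{}
	for k, v := range s.values {
		snap[k] = v
	}
	return &Txn{start: s.clock, snap: snap, reads: map[string]bool{}, writes: map[string]int{}}
}

func (t *Txn) Get(k string) int    { t.reads[k] = true; return t.snap[k] }
func (t *Txn) Put(k string, v int) { t.writes[k] = v }

// Commit applies buffered writes unless it finds a conflict. Under SI
// only a write-write conflict aborts. With serializable set, a key that
// was read and then overwritten after the snapshot also aborts.
func (s *Store) Commit(t *Txn, serializable bool) bool {
	for k := range t.writes {
		if s.written[k] > t.start {
			return false
		}
	}
	if serializable {
		for k := range t.reads {
			if s.written[k] > t.start {
				return false
			}
		}
	}
	s.clock++
	for k, v := range t.writes {
		s.values[k] = v
		s.written[k] = s.clock
	}
	return true
}

// runSkew runs the two-withdrawal scenario and returns the number of
// committed transactions and the final value of x + y.
func runSkew(serializable bool) (commits, sum int) {
	s := NewStore(map[string]int{"x": 50, "y": 50})
	t1, t2 := s.Begin(), s.Begin()
	if t1.Get("x")+t1.Get("y") >= 100 {
		t1.Put("x", t1.Get("x")-100)
	}
	if t2.Get("x")+t2.Get("y") >= 100 {
		t2.Put("y", t2.Get("y")-100)
	}
	for _, t := range []*Txn{t1, t2} {
		if s.Commit(t, serializable) {
			commits++
		}
	}
	return commits, s.values["x"] + s.values["y"]
}

func main() {
	fmt.Println(runSkew(false)) // 2 -100: both commit, the invariant breaks
	fmt.Println(runSkew(true))  // 1 0: the second transaction aborts
}
```

Under SI the two transactions write disjoint keys, so neither sees a conflict and the combined balance goes negative. The serializable check rejects the second transaction because a key it read changed after its snapshot, which is the extra cost contentious workloads pay.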

Similar to Spanner directories, CockroachDB allows configuration of arbitrary zones of data. This allows replication factor, storage device type, and/or datacenter location to be chosen to optimize performance and/or availability. Unlike Spanner, zones are monolithic and don’t allow movement of fine-grained data on the level of entity groups.

SQL - NoSQL - NewSQL Capabilities
