/calvin

Calvin is a scalable transactional database system that leverages determinism to guarantee active replication and full ACID compliance of distributed transactions without two-phase commit. Most of the code was written for the VLDB 2014 paper "An Evaluation of the Advantages and Disadvantages of Deterministic Database Systems".

This material is based upon work supported by the National Science Foundation under Grant Number IIS-1249722. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

Copyright (c) 2014 Yale University. All rights reserved.
Developed by: Alexander Thomson, Kun Ren, Thaddeus Diamond, Shu-Chun Weng, Philip Shao, Prof. Daniel Abadi

Calvin is a massively scalable transactional database solution that allows
for concurrent, multi-threaded application execution while maintaining
ACID guarantees.  To do this, Calvin uses a pre-determined serial ordering
of transactions, and every node must guarantee that its execution is
equivalent to that ordering. Most of the code was written for the VLDB 2014
paper "An Evaluation of the Advantages and Disadvantages of Deterministic
Database Systems".
(The paper is available at: http://www.vldb.org/pvldb/vol7/p821-ren.pdf)
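
The idea of a pre-determined serial order can be illustrated with a minimal
C++ sketch. It is not taken from the Calvin source: the Transfer type and the
Apply and Replay functions are hypothetical names invented for this
illustration. It only shows that replicas which replay the same ordered log
from the same initial state end in the same state, while a different order can
produce a different result.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical transaction type: move `amount` from one record to another.
struct Transfer {
  std::string from;
  std::string to;
  int64_t amount;
};

using Store = std::map<std::string, int64_t>;

// Applies one transfer. The transfer is skipped when funds are insufficient,
// so its outcome depends on the transactions that were applied before it.
void Apply(Store& store, const Transfer& t) {
  assert(t.amount >= 0);
  if (store[t.from] < t.amount) return;  // deterministic abort
  store[t.from] -= t.amount;
  store[t.to] += t.amount;
}

// Replays a log strictly in the given order and returns the resulting state.
// The store is taken by value, so the caller's copy is left untouched.
Store Replay(Store store, const std::vector<Transfer>& ordered_log) {
  for (const Transfer& t : ordered_log) {
    Apply(store, t);
  }
  return store;
}
```

Calling Replay twice with the same initial store and the same log models two
replicas: both return equal stores. Reordering the log may change the result,
which is why every node has to agree on one order before executing.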

Prerequisites
  - GNU/Linux distro (kernel >= 2.6.37.6)
  - G++ >= 4.5.1
  - Satisfy all dependencies of the following external libraries
      -# GoogleTest      - Google's Unit Testing Framework
          - Object Linking: ext/googletest/lib/.libs
          - Header Include: ext/googletest/include
      -# ProtocolBuffers - Google's Framework for Serializable PODS
          - Object Linking: ext/protobuf/src/.libs
          - Header Include: ext/protobuf/src
      -# ZeroMQ          - Efficient Message Passing System
          - Object Linking: ext/zeromq/src/.libs
          - Header Include: ext/zeromq/include
      -# Zookeeper       - Apache Implementation of Paxos protocol
          - Object Linking: ext/zookeeper/.libs
          - Header Include: ext/zookeeper/include ext/zookeeper/generated

The source folder contains several scripts and directories:
     - README            - This file
     - INSTALL           - A detailed (yet slightly outdated) list of installation instructions
     - ./install-ext     - A script to install all external libraries linked to this project
     - deploy-run.conf   - Lists the machines on which Calvin runs
     - ext/              - Contains several external libraries used in Calvin that must be compiled and linked to the source
     - src_calvin/       - The basic Calvin codebase
     - src_calvin_3_partitions/    - The Calvin codebase in which each distributed transaction spans 3 partitions
     - src_calvin_4_partitions/    - The Calvin codebase in which each distributed transaction spans 4 partitions
     - src_calvin_vector_vll/      - The Calvin codebase that uses VLL to implement the Lock Manager thread
     - src_dependent_remote_index/ - The Calvin codebase that was used to test dependent transactions
     - src_dependent_variable_sized_reconnaissance_phases  - The Calvin codebase that varies the amount of work that needs to be done before all dependencies have been resolved
     - src_single_thread_vll       - The single-threaded VLL implementation
     - src_traditional_2pl_2pc     - The basic traditional nondeterministic implementation, based on the Calvin codebase
     - src_traditional_2pl_2pc_3_partitions     - The traditional nondeterministic implementation in which each distributed transaction spans 3 partitions
     - src_traditional_2pl_2pc_4_partitions     - The traditional nondeterministic implementation in which each distributed transaction spans 4 partitions
(TODO: We plan to combine the src_calvin_* folders into one codebase, and likewise the src_traditional_* folders.)


Installation
  To compile the external libraries associated with Calvin, please run:
    $ ./install-ext
  To compile the source, rename the codebase directory you want to build
  (src_*** stands for one of the src_* directories listed above) to src and
  run make:
    $ mv src_*** src
    $ cd src
    $ make -j
  Three directories will be created: bin/, obj/ and logs/.
   - obj/      - Where all the .o files and some .d (dependency) files are written
   - bin/      - Where all the binaries and some .d (dependency) files are written

  In order to run an executable (including an individual test) simply invoke the
  appropriate binary file from the command line.  For example, if you wanted to
  run calvin_ctl (the executable for launching Calvin), you would invoke from
  the root directory:
    $ bin/deployment/cluster -c deploy-run.conf -p src/deployment/portfile -d bin/deployment/db m 0

  To run on a single machine only, run this command from the root directory:
    $ bin/deployment/db 0 m 0

  Note: All of our experiments so far were run on 8-core machines, so the current codebase only runs on 8-core machines; we will make it more general in the future. To run on multiple machines, make sure that every machine has the same user name and that each machine can ssh to the other machines without a password (e.g. "ssh 128.26.232.18"). Also make sure that every machine has exactly the same code, and build that code on each machine.

  There are some important parameters you may need to edit:
   - src/deployment/main.cc, #define HOT ***: Sets the number of hot records for the microbenchmark; it is used to vary the contention index (100 means contention index = 0.01).
   - src/sequencer/sequencer.h, #define MAX_BATCH_SIZE ***: Sets the batch size per 10 ms epoch; set it a little higher than the actual throughput (200 means the sequencer creates 20K transactions every second).
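
  The arithmetic behind the two examples in parentheses can be sketched as
  follows. The helper names and kEpochMillis are hypothetical and do not appear
  in the Calvin source. The sketch assumes that the contention index is the
  reciprocal of HOT (which is consistent with "100 means 0.01") and that the
  sequencer emits at most MAX_BATCH_SIZE transactions per 10 ms epoch.

```cpp
#include <cassert>

// Assumption: one sequencer epoch lasts 10 ms, as stated in this README.
constexpr int kEpochMillis = 10;

// Contention index, assumed to be 1 / HOT (the number of hot records).
double ContentionIndex(int hot_records) {
  assert(hot_records > 0);
  return 1.0 / hot_records;
}

// Upper bound on the number of transactions created per second:
// MAX_BATCH_SIZE transactions per epoch times the epochs per second.
long MaxTxnsPerSecond(int max_batch_size) {
  assert(max_batch_size >= 0);
  const int epochs_per_second = 1000 / kEpochMillis;  // 100 epochs
  return static_cast<long>(max_batch_size) * epochs_per_second;
}
```

  Under these assumptions, ContentionIndex(100) is 0.01 and
  MaxTxnsPerSecond(200) is 20000, matching the figures quoted for HOT and
  MAX_BATCH_SIZE.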

  Make sure that your LD_LIBRARY_PATH includes the object-linking directories noted in the prerequisites above. You also need to edit deploy-run.conf to list the machines on which Calvin runs (the port must be the same as the port in src/deployment/portfile).
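
  As a rough aid for checking the library path, the following sketch reports
  which required library directories are missing from a colon-separated
  LD_LIBRARY_PATH value. MissingLibraryDirs and EndsWith are hypothetical
  helpers, not part of Calvin. The sketch assumes that an entry counts as a
  match when it ends with one of the relative "Object Linking" paths listed
  under Prerequisites, and it does not normalize trailing slashes.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Returns true if `text` ends with `suffix`.
bool EndsWith(const std::string& text, const std::string& suffix) {
  return text.size() >= suffix.size() &&
         text.compare(text.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Returns the entries of `required_dirs` that no entry of the
// colon-separated `ld_library_path` ends with.
std::vector<std::string> MissingLibraryDirs(
    const std::string& ld_library_path,
    const std::vector<std::string>& required_dirs) {
  // Split the path on ':' into its individual entries.
  std::vector<std::string> entries;
  std::stringstream stream(ld_library_path);
  std::string entry;
  while (std::getline(stream, entry, ':')) {
    entries.push_back(entry);
  }

  std::vector<std::string> missing;
  for (const std::string& dir : required_dirs) {
    assert(!dir.empty());  // an empty suffix would match every entry
    bool found = false;
    for (const std::string& candidate : entries) {
      if (EndsWith(candidate, dir)) {
        found = true;
        break;
      }
    }
    if (!found) missing.push_back(dir);
  }
  return missing;
}
```

  One way to use it is to pass the value returned by
  std::getenv("LD_LIBRARY_PATH") (when it is not null) together with the four
  directories ext/googletest/lib/.libs, ext/protobuf/src/.libs,
  ext/zeromq/src/.libs and ext/zookeeper/.libs.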

  This codebase is currently maintained by Kun Ren (kun@cs.yale.edu).