We are grateful to the authors of DBx1000 for publishing the initial implementation as open source, which made our work possible: https://github.com/yxymit/DBx1000
Setup
Install build dependencies via package manager:
gcc (at least version 9)
cmake (at least version 3.10)
libnuma-devel
Install the TBB dependency as a Git submodule. If fetching the repository fails, try git submodule update --remote instead of git submodule update. If that still does not work, add the submodule manually (git submodule add --force --name libs/tbb https://github.com/oneapi-src/oneTBB.git libs/tbb/). Afterwards, check that the libs/tbb folder contains the current version of the TBB GitHub repository.
git submodule init
git submodule update
Create build folder, run cmake and build the database
mkdir build
cd build
cmake ..
make TARGET
Execute the run script in the main folder (i.e. lab20-DBx1000) and pass it a run configuration file, e.g. ./run_tpcc.sh configurations/test.conf. The results are written to results/yyyy-MM-dd_hh-mm-ss. Make sure the corresponding CMake targets have been built. If they have not, check that CMakeLists.txt defines the target and that the target has been built in the build directory.
./run_tpcc.sh CONFIG-FILE
The run script configuration file should contain the following parameters:
| Name | Definition | Example Value |
| --- | --- | --- |
| WHs | List of different warehouse numbers | WHs="4 512 1024" |
| threads | List of number of threads | threads="1 2 4 8" |
| remote_news | List | remote_news="0 1 5 10" |
| CCs | List of concurrency control algorithms | CCs="WAIT_DIE NO_WAIT" |
| NO_HTs | Run with or without hyperthreading (0 = hyperthreading, 1 = no hyperthreading) | NO_HTs="0 1" |
| SMALLs | Run with small relations | SMALLs="0 1" |
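As a rough illustration, the example values above can be combined into one configuration file. The sketch below assumes the file is sourced as plain shell variable assignments and that each combination of values becomes one benchmark run; both points are assumptions, and the actual run_tpcc.sh may order, filter, or interpret the lists differently.

```shell
# Hypothetical contents of configurations/test.conf, built from the example values above.
WHs="4 512 1024"
threads="1 2 4 8"
remote_news="0 1 5 10"
CCs="WAIT_DIE NO_WAIT"
NO_HTs="0 1"
SMALLs="0 1"

# Assumed sweep: count every combination of the listed values.
# This only counts runs; it does not start the benchmark.
runs=0
for wh in $WHs; do
  for t in $threads; do
    for rn in $remote_news; do
      for cc in $CCs; do
        for ht in $NO_HTs; do
          for small in $SMALLs; do
            runs=$((runs + 1))
          done
        done
      done
    done
  done
done
echo "$runs combinations"
```

Under that assumption the lists multiply quickly (3 * 4 * 4 * 2 * 2 * 2 = 384 combinations here), so it can help to start with short lists when testing a new setup.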
Configuration
DBMS configurations can be changed in the config.h file. The configuration file to use is set in the first lines of CMakeLists.txt.
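For repeated experiments it can be convenient to change a setting from a script rather than by hand. The helper below is a minimal sketch, assuming each setting appears in config.h as a single line of the form "#define NAME VALUE"; set_define is a hypothetical name and not part of the repository. Anything after the name on that line, including a trailing comment, is replaced.

```shell
# set_define FILE NAME VALUE
# Rewrites the line "#define NAME <anything>" in FILE to "#define NAME VALUE".
# Assumes one "#define NAME ..." line per setting (hypothetical helper).
set_define() {
    file=$1
    name=$2
    value=$3
    tmp=$(mktemp)
    sed -E "s|^(#define[[:space:]]+${name})[[:space:]].*|\1 ${value}|" "$file" > "$tmp" \
        && mv "$tmp" "$file"
}

# Example (run from the folder that contains config.h, then rebuild):
#   set_define config.h THREAD_CNT 8
```

Because the settings are compile-time definitions, the targets have to be rebuilt after any change.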
Categories:
Simulation & Hardware
Concurrency Control
Specific Configuration for INDEX
Specific Configuration for DL_DETECT
Specific Configuration for TIMESTAMP
Specific Configuration for MVCC
Specific Configuration for OCC
Specific Configuration for TICTOC
Specific Configuration for TICTOC & SILO
Specific Configuration for OCC & TICTOC & SILO
Specific Configuration for HSTORE
Specific Configuration for VLL
Logging
Benchmark
Specific Configuration for YCSB
Specific Configuration for TPCC
Centralized CC Management - Should be ignored
Test Cases
Debug Info
Spin Waiting
Thread Alloc Hawk
Misc
Simulation & Hardware
| Configuration | Definition | Values |
| --- | --- | --- |
| THREAD_CNT | Number of worker threads running in the database | int |
| PART_CNT | Number of logical partitions in the system | int |
| VIRTUAL_PART_CNT | Each transaction only accesses 1 virtual partition. But the lock/ts manager and index are not aware of such partitioning. VIRTUAL_PART_CNT describes the request distribution and is only used to generate queries. For HSTORE, VIRTUAL_PART_CNT should be the same as PART_CNT. | int |
| PAGE_SIZE | Memory page size | int |
| CL_SIZE | Cache line size of the hardware | int |
| CPU_FREQ | CPU frequency used to get accurate timing info | int (in GHz) |
| WARMUP | Number of transactions to run for warmup | int |
| WORKLOAD | Supported workloads include YCSB and TPCC | YCSB, TPCC, TEST |
| PRT_LAT_DISTR | Print the transaction latency distribution | true / false |
| STATS_ENABLE | Print statistics | true / false |
| TIME_ENABLE | Use real time for measurements, otherwise use fake time. | true / false |
| MEM_ALLIGN | Allocated blocks are aligned to MEM_ALLIGN bytes | int (in bytes) |
| THREAD_ARENA_SIZE | | int |
| MEM_PAD | Enable memory padding to avoid false sharing | true / false |
| PART_ALLOC | | true / false |
| MEM_SIZE | Deprecated (never used) | int |
Memory Allocation
| Configuration | Definition | Values |
| --- | --- | --- |
| MALLOC_TYPE | Memory allocator to use | MEM_ALLOC, HAWK_ALLOC, JE_MALLOC (future) |
Specific Configuration for MEM_ALLOC
| Configuration | Definition | Values |
| --- | --- | --- |
| THREAD_ALLOC | Per-thread allocator. If false, std::malloc is used. | true / false |
| NO_FREE | Do not free any memory | true / false |
Specific Configuration for HAWK_ALLOC
| Configuration | Definition | Values |
| --- | --- | --- |
| THREAD_ALLOC_HAWK_INSERT | | true / false |
| HAWK_ALLOC_CAPACITY | Hawked memory capacity. | int |
Concurrency Control
| Configuration | Definition | Values |
| --- | --- | --- |
| CC_ALG | Concurrency control algorithm | NO_WAIT (no-wait two-phase locking), WAIT_DIE (wait-and-die two-phase locking), DL_DETECT, TIMESTAMP (basic T/O), MVCC (multi-version T/O), HSTORE (H-Store), OCC (optimistic concurrency control), TICTOC, SILO, VLL, HEKATON |
| ISOLATION_LEVEL | Isolation level | SERIALIZABLE, SNAPSHOT, REPEATABLE_READ |
| KEY_ORDER | All transactions acquire tuples according to the primary key order. | true / false |
| ROLL_BACK | Roll back the modifications if a transaction aborts | true / false |
| CENTRAL_MAN | Per-row lock/ts management or central lock/ts management | true / false |
| BUCKET_CNT | | int |
| ABORT_PENALTY | | int |
| ABORT_BUFFER_SIZE | | int |
| ABORT_BUFFER_ENABLE | | true / false |
Specific Configuration for INDEX
| Configuration | Definition | Values |
| --- | --- | --- |
| CENTRAL_INDEX | Centralized index structure (part count for index initialization is 1) | |
| | Number of spinning cycles for exponential back-off for bounded spinning | int |
Outputs
txn_cnt: The total number of committed transactions. This number is close to but smaller than THREAD_CNT * MAX_TXN_PER_PART. When any worker thread commits MAX_TXN_PER_PART transactions, all the other worker threads will be terminated. This way, we can measure the steady state throughput where all worker threads are busy.
abort_cnt: The total number of aborted transactions. A transaction may abort multiple times before committing. Therefore, abort_cnt can be greater than txn_cnt.
run_time: The aggregated transaction execution time (in seconds) across all threads. run_time is approximately the program execution time * THREAD_CNT. Therefore, the per-thread throughput is txn_cnt / run_time and the total throughput is txn_cnt / run_time * THREAD_CNT.
time_{wait, ts_alloc, man, index, cleanup, query}: Time spent on different components of DBx1000. All numbers are aggregated across all threads.
time_abort: The time spent on transaction executions that eventually aborted.
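The throughput formulas given for run_time can be applied to the reported numbers with a small helper. This is a sketch only: the function names are hypothetical, and the sample values in the usage lines are made up for illustration rather than taken from a real run.

```shell
# per_thread_throughput TXN_CNT RUN_TIME
# Per-thread throughput: txn_cnt / run_time.
per_thread_throughput() {
    awk -v txn="$1" -v rt="$2" 'BEGIN { printf "%.2f\n", txn / rt }'
}

# total_throughput TXN_CNT RUN_TIME THREAD_CNT
# Total throughput: txn_cnt / run_time * THREAD_CNT.
total_throughput() {
    awk -v txn="$1" -v rt="$2" -v thr="$3" 'BEGIN { printf "%.2f\n", txn / rt * thr }'
}

# Illustrative values: 400000 committed transactions, 8.0 s aggregated run_time, 4 threads.
per_thread_throughput 400000 8.0    # prints 50000.00
total_throughput 400000 8.0 4       # prints 200000.00
```

Since run_time is aggregated across all threads, dividing txn_cnt by it alone gives the per-thread rate; the multiplication by THREAD_CNT recovers the system-wide rate in transactions per second.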