COP6726 - Database System Implementation - Database From Scratch
- sqlike/ - project root directory.
  - bin/ - contains bin files generated through the `DBFile::Load` method.
    - 1gb/ - contains bin files for the 1gb dataset.
    - 10mb/ - contains bin files for the 10mb dataset.
  - build/ - contains compiled binaries and executables. This is the directory from which we'll execute the commands to run our project.
  - docs/ - contains the project description documentation given by the Professor. Also contains screenshots of the results.
  - files/ - contains tbl dataset files generated through TPC-H dbgen. Also contains the `catalog` file, which holds the schema definitions for our dataset.
    - 1gb/ - contains tbl files for the 1gb dataset.
    - 10mb/ - contains tbl files for the 10mb dataset.
  - src/ - contains project source code.
  - test-cases/ - contains test files and the test script for the project.
Note: The folders `bin`, `build`, and `files/1gb` have not been pushed to GitHub due to storage limitations.
- OS: Windows with WSL (Windows Subsystem for Linux). I downloaded Ubuntu 20.04 LTS from the Microsoft Store.
- IDE: CLion, with configuration done following this and this. I also made some changes to this configuration and have provided screenshots of my setup (`docs/toolchains-setup.jpg` & `docs/cmake-setup.jpg`) for reference.
- GTest: In addition to this tutorial, I have also configured GTest by running `sudo apt-get install libgtest-dev` inside the Ubuntu terminal.
Note: All of these configuration commands need to be run in the Ubuntu terminal. I have used `cmake` instead of `make` so that I can debug using CLion.
Run the following commands in order to run this project on your machine.
- `git clone https://github.com/phoenix-254/sqlike.git`
- `cd sqlike/src/` - move to the src folder.
- `cmake -B../build -H.` - this will generate the build folder with all the required files using the `CMakeLists.txt` file in the `src` folder. (from the `src/` directory)
- `cd ../build/` - move to the build folder.
- `cmake --build . --target sqlike-test` - compiles the code and generates an executable. (from the `build/` directory)
  The `sqlike-test` here is the name of the executable you want to generate; it can be `clean` or any other target defined in your `CMakeLists`. E.g., in order to clean, we can use `cmake --build . --target clean`.
- `./sqlike-test` - to run the code. (from the `build/` directory)
- `./run.sh` - to run the test script and generate `output1.txt`. (from the `test-cases` directory)
Note: You must create an empty `bin` folder with two sub-folders (`1gb` & `10mb`) inside the root folder, as depicted in the directory structure above, prior to running this project. Also, you have to generate the 1gb tbl files using TPC-H dbgen yourself and put them in the `files/1gb/` folder if you want to test against the 1gb dataset.
- Record: This class implements the actual objects that your database will store and stores all of the data in each record as a flat bit string.
- Page: This is the in-memory realization of a database page; a page is essentially a collection of database records. Previously inside the File class.
- File: This is a disk-based container class that holds an array of pages.
- Comparison: This file implements many of the standard operations that must be provided by the database record manager; that is, the operations that allow your database to semantically interpret the records that it stores. There is one class called CNF, which is constructed from the parse tree for a conjunctive normal form predicate. This class tells the database system how to apply a user-supplied conjunctive normal form expression to a given record. There is another class called OrderMaker that encodes a less-than/greater-than comparison across two records; this class is used for sorting operations.
- ComparisonEngine: This class contains the code that actually uses the classes that are provided in Comparison.h to perform comparisons. For example, the ComparisonEngine class will allow you to actually use a CNF object to see whether or not a given record has been accepted by the underlying conjunctive normal form predicate.
- Schema: This file encodes a few functions that load up a relation schema from the database catalog using the `catalog` file.
- Config: A simple header file containing static information for the project, e.g. the path where tbl or bin files reside.
- Const: A simple header file containing constant values used in this project. Previously Defs.h.
- TwoWayList: A data structure used by `Page` to hold a collection of records.
- ParseTree: Contains the tree structure for the CNF.
- Parser: Used to parse the CNF supplied by the user. This makes it possible for you to easily type CNF statements using the keyboard. It uses the Bison library.
- Scanner: Defines the rules for how to scan the CNF given as input by the user and what action to take for each token. It uses the Flex library. Previously Lexer.l.
- DBFile: A driver class that provides an interface for simply storing and retrieving records from the database.
- GenericDBFile: A virtual base class that is used internally by `DBFile` to implement either `Heap` or `Sorted` file functionality.
- Heap: This class holds all the logic related to functions for a DBFile of type `Heap`. This extends `GenericDBFile`.
- Sorted: This class holds all the logic related to functions for a DBFile of type `Sorted`. This extends `GenericDBFile`.
. - Pipe: This class works as a temporary buffer for all the records needed to be sorted. This works in conjunction with the Producer, Consumer, and BigQ Worker threads and helps in keeping synchronization among them.
- BigQ: This class does the job of sorting all the records from the input pipe according to the given sort-order, and then writing them to the output pipe.
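
To make the Pipe and BigQ descriptions above more concrete, here is a minimal sketch of a bounded buffer shared by producer, worker, and consumer threads. Everything in it is an assumption made for illustration: `SketchPipe`, `SortWorker`, and `SortThroughPipes` are hypothetical names, plain `int` values stand in for records, `operator<` stands in for the sort order, and the worker sorts everything in memory. The project's own `Pipe` and `BigQ` classes may be structured quite differently.

```cpp
#include <algorithm>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for Pipe: a bounded, thread-safe buffer of ints.
class SketchPipe {
public:
    explicit SketchPipe(std::size_t capacity) : capacity_(capacity) {}

    // Blocks while the buffer is full, then appends one record.
    void Insert(int rec) {
        std::unique_lock<std::mutex> lock(mu_);
        notFull_.wait(lock, [this] { return buf_.size() < capacity_; });
        buf_.push_back(rec);
        notEmpty_.notify_one();
    }

    // Blocks while the buffer is empty and still open.
    // Returns false once the pipe has been shut down and drained.
    bool Remove(int &rec) {
        std::unique_lock<std::mutex> lock(mu_);
        notEmpty_.wait(lock, [this] { return !buf_.empty() || closed_; });
        if (buf_.empty()) return false;
        rec = buf_.front();
        buf_.pop_front();
        notFull_.notify_one();
        return true;
    }

    // Signals that no more records will be inserted.
    void ShutDown() {
        std::lock_guard<std::mutex> lock(mu_);
        closed_ = true;
        notEmpty_.notify_all();
    }

private:
    std::size_t capacity_;
    std::deque<int> buf_;
    bool closed_ = false;
    std::mutex mu_;
    std::condition_variable notFull_;
    std::condition_variable notEmpty_;
};

// Hypothetical stand-in for the BigQ worker: drains the input pipe,
// sorts in memory (a simplification), and writes to the output pipe.
void SortWorker(SketchPipe &in, SketchPipe &out) {
    std::vector<int> records;
    int rec = 0;
    while (in.Remove(rec)) records.push_back(rec);
    std::sort(records.begin(), records.end());
    for (int r : records) out.Insert(r);
    out.ShutDown();
}

// Wires a producer, the worker, and a consumer together through two pipes.
std::vector<int> SortThroughPipes(const std::vector<int> &input) {
    SketchPipe in(4), out(4);
    std::vector<int> result;
    std::thread producer([&] {
        for (int r : input) in.Insert(r);
        in.ShutDown();
    });
    std::thread worker([&] { SortWorker(in, out); });
    std::thread consumer([&] {
        int r = 0;
        while (out.Remove(r)) result.push_back(r);
    });
    producer.join();
    worker.join();
    consumer.join();
    return result;
}
```

The small capacity is deliberate: with more records than slots, the producer and worker must block and wake each other, which is the synchronization role the Pipe entry describes.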
Refer to `docs/ProjectDescription.pdf` for more information.
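
As a rough illustration of the DBFile / GenericDBFile relationship described in the class list, the sketch below shows a driver class delegating to a virtual base with heap and sorted variants. All names here (`GenericFile`, `HeapFile`, `SortedFile`, `DriverFile`, `FileType`) are hypothetical, records are reduced to `int` values held in memory, and the method set is invented for the example; it is not the project's actual interface.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <vector>

enum class FileType { Heap, Sorted };

// Hypothetical counterpart of GenericDBFile: the interface both file types share.
class GenericFile {
public:
    virtual ~GenericFile() = default;
    virtual void Add(int rec) = 0;
    virtual std::vector<int> Scan() const = 0;
};

// Heap variant: records are kept in insertion order.
class HeapFile : public GenericFile {
public:
    void Add(int rec) override { recs_.push_back(rec); }
    std::vector<int> Scan() const override { return recs_; }

private:
    std::vector<int> recs_;
};

// Sorted variant: each record is inserted at its sorted position.
class SortedFile : public GenericFile {
public:
    void Add(int rec) override {
        recs_.insert(std::upper_bound(recs_.begin(), recs_.end(), rec), rec);
    }
    std::vector<int> Scan() const override { return recs_; }

private:
    std::vector<int> recs_;
};

// Hypothetical counterpart of DBFile: picks an implementation once,
// then forwards every call to it.
class DriverFile {
public:
    explicit DriverFile(FileType type) {
        if (type == FileType::Heap) {
            impl_.reset(new HeapFile());
        } else {
            impl_.reset(new SortedFile());
        }
    }
    void Add(int rec) { impl_->Add(rec); }
    std::vector<int> Scan() const { return impl_->Scan(); }

private:
    std::unique_ptr<GenericFile> impl_;
};
```

With this shape, callers only ever touch the driver class, so choosing between heap and sorted behavior is a single decision made at construction time.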