
Code for the paper titled "Columnar Storage and List-based Processing for Graph Database Management Systems". VLDB'21


Columnar Storage and List-based Processing in GraphflowDB

The official code repository of our VLDB 2021 paper Columnar Storage and List-based Processing for Graph Database Management Systems.

  • Research-track Paper @ VLDB 2021 [link]
  • arXiv (long-version) [link]



This repository contains the datasets, the queries, and the versions of the Graphflow system that we use in our paper.

Contents

Codebase

This repository contains 2 versions of GraphflowDB.

  1. GF-RV [link]: The baseline version, which implements vanilla row-based storage and a Volcano-style processor.
  2. GF-CL [link]: The version that implements our novel column-oriented storage and the list-based processor.
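
To make the difference between the two designs concrete, here is a minimal sketch of the general ideas, not code taken from GF-RV or GF-CL. All names (`StorageSketch`, `PersonRow`, `RowScan`, `sumAgesRowAtATime`, `sumAgesColumnar`) are hypothetical and exist only for illustration. The row-based variant keeps every property of a record together and pulls one tuple per `next()` call, as a Volcano-style operator would. The columnar variant keeps one array per property and processes a whole list of values in a single loop.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: hypothetical names, not the GraphflowDB API.
final class StorageSketch {

    // Row-oriented storage: all properties of one record sit together.
    static final class PersonRow {
        final String name;
        final int age;

        PersonRow(String name, int age) {
            this.name = name;
            this.age = age;
        }
    }

    // Volcano-style scan: each call to next() hands back a single tuple,
    // or null once the input is exhausted.
    static final class RowScan {
        private final List<PersonRow> rows;
        private int pos = 0;

        RowScan(List<PersonRow> rows) {
            this.rows = rows;
        }

        PersonRow next() {
            return pos < rows.size() ? rows.get(pos++) : null;
        }
    }

    // Tuple-at-a-time aggregation over the row store.
    static long sumAgesRowAtATime(List<PersonRow> rows) {
        RowScan scan = new RowScan(rows);
        long sum = 0;
        for (PersonRow r = scan.next(); r != null; r = scan.next()) {
            sum += r.age;
        }
        return sum;
    }

    // Column-oriented storage: the "age" property is one contiguous array,
    // and the operator consumes the whole list of values in a tight loop.
    static long sumAgesColumnar(int[] ageColumn) {
        long sum = 0;
        for (int age : ageColumn) {
            sum += age;
        }
        return sum;
    }

    public static void main(String[] args) {
        List<PersonRow> rows = new ArrayList<>();
        rows.add(new PersonRow("a", 30));
        rows.add(new PersonRow("b", 35));
        int[] ageColumn = {30, 35};

        System.out.println(sumAgesRowAtATime(rows));
        System.out.println(sumAgesColumnar(ageColumn));
    }
}
```

Both methods compute the same result. The sketch only shows where the per-tuple call overhead and the property layout differ; the actual storage structures and operators are described in the paper and implemented in the two codebases above.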

Both projects follow common instructions for building and benchmarking that can be found here.

Datasets

We provide 2 datasets that we use for system comparison in the paper.

  1. IMDb 2013 dataset [link]
  2. LDBC SNB dataset (scale factor 10) [link]

Users can also test their own benchmark queries on their own datasets. Instructions for creating your own dataset can be found here.

Benchmarks

We test our system on 2 leading benchmarks: JOB and LDBC SNB. Since Graphflow does not support some advanced SQL features, we modify some queries as needed. We include the exact queries that we use here, in a format that can be run directly on the builds.

  1. JOB Benchmark Queries [link]
  2. LDBC SNB Interactive Complex Queries [link]
  3. LDBC SNB Interactive Short Queries [link]

Artifacts

Contact

Pranjal Gupta
Amine Mhedhbi

License

This project is licensed under the MIT License - see the LICENSE file for details.


GraphflowDB Project

Created at Data Systems Group, University of Waterloo, Canada.