RQs_in_Regex_Graphs: A C++ repository from Abhipanda4

Regular Path Queries Using Monte Carlo Framework

This work tries to optimize regular expression queries based on a data graph by calculating a partial index using Monte carlo sampling.

A data graph is a collection of nodes and edges. For this work, the synthetic graphs considered have the following characterstics:

The queries are specified by using node predicates and regular expressions:

A node predicate can be specified by an ops string like:
- ____==F<=81<=158__
- The possible ops for strings are: == & !=
- The possible ops for integers are: ==, >, >=, <, <=, !=
- In case we don't care about some attribute, it is specified by __(double underscore)
The regular expression is a string formed from the grammar S -> c | c<=k | c+; where c is a color and k is any integer.

Go into the src folder. There are 3 cpp files
- partial_indexing_BFS.cpp : This is responsible for running monte-carlo simulations to calculate a partial index and then answer Regular path Queries using this index.
- trans_closure.cpp : This calculates a full transitive closure for the data graph and uses it to answer the queries.
- BFS_efficient.cpp : This implemenatation uses simple BFS algorithm to find possible paths.
To compile any source file, use command:
```
g++ -std=c++11 <source-file-name>
```
Optionally, you can use a -o flag with g++ to give the binary a custom name
```
g++ -std=c++11 <source-file-name> -o <custom-name>
```
To finally run the program, use:
```
./a.out <target-graph-file>
```
where target-graph-file is the graph file on which queries are to be performed. A sample file has been provided in data_files folder, named graph_15000.txt which has 15000 nodes and 45000 edges.