Scripts for gathering and analyzing pmotif data, based on pmotif_lib.
Install the python dependencies:
pip install -r requirements
The pmotif_lib dependency requires a gTrieScanner executable to work.
Download the sources, compile them, and add the executable to your path.
Furthermore, pmotif_lib expects certain environment variables to be set. It is recommended to create a pmotif_lib.env file, which can be used to export those variables:
export DATASET_DIRECTORY=/path/where/edgelists/are/located
export EXPERIMENT_OUT=/path/where/raw/graphlets/and/pmetric/should/go
export GTRIESCANNER_EXECUTABLE=/path/to/the/gtriescanner/executable
export WORKERS=1
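Before starting a long run, it can help to confirm that these variables are actually set. The following is a minimal sketch, not part of pmotif_lib: `missing_env_vars` is a hypothetical helper, and the variable list is simply copied from the exports above.

```python
import os

# Variable names copied from the pmotif_lib.env example above.
REQUIRED_VARS = (
    "DATASET_DIRECTORY",
    "EXPERIMENT_OUT",
    "GTRIESCANNER_EXECUTABLE",
    "WORKERS",
)


def missing_env_vars(env=None):
    """Return the required variables that are unset or empty (hypothetical helper)."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Calling `missing_env_vars()` after `source pmotif_lib.env` should return an empty list; any names it does return still need to be exported.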
To run a pmotif detection pipeline, run the pmotif_detection.py script:
source pmotif_lib.env # Export the env vars required by pmotif_lib
python3 pmotif_detection.py --edgelist_name your_edgelist --graphlet_size [3,4] --random_graphs [-1, n]
- --edgelist_name specifies the location of a graph edgelist, relative to DATASET_DIRECTORY.
- --graphlet_size specifies the graphlet size and is limited to either 3 or 4.
- --random_graphs specifies the number of random graphs to generate.
  - If set to a number >= 0, it will generate that many random graphs and will fail if random graphs are already present.
  - If set to -1, it will not generate new random graphs but will reuse the random graphs that are already present. Use this if you already ran with a different graphlet size and want to use the same random graphs.

Example:
source pmotif_lib.env # Export the env vars required by pmotif_lib
python3 pmotif_detection.py --edgelist_name karate_club.edgelist --graphlet_size 3 --random_graphs 10
This will create a new directory in EXPERIMENT_OUT, named after the edgelist. It will contain the graphlets and pmetrics of the original graph, the random graphs, and the graphlets and pmetrics of each random graph.
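The two modes of --random_graphs can be summarized as a small decision function. This is only an illustration of the documented behavior, not the code used by pmotif_detection.py; the function name and the choice of exception types are assumptions.

```python
def random_graphs_to_generate(requested, existing):
    """Decide how many random graphs to generate (illustrative only).

    requested: value passed to --random_graphs (-1 or a number >= 0)
    existing:  number of random graphs already present on disk
    """
    if requested == -1:
        # Reuse mode: nothing new is generated.
        return 0
    if requested < -1:
        raise ValueError("--random_graphs must be -1 or a number >= 0")
    if existing > 0:
        # Mirrors the documented failure when random graphs already exist.
        raise FileExistsError("random graphs already present; pass -1 to reuse them")
    return requested
```

For instance, a first run with `--graphlet_size 3 --random_graphs 10` corresponds to `random_graphs_to_generate(10, 0)`, and a follow-up run with `--graphlet_size 4 --random_graphs -1` corresponds to `random_graphs_to_generate(-1, 10)`.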
The pmotif_detection_benchmark.py script is essentially the same script, with three differences:
- It removes the output after it has run.
- It creates a log file with runtimes for the sub-steps of the pmotif detection pipeline.
- It takes the additional argument --benchmarking_run, specifying a number prefixed to the aforementioned log file.
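To give a rough idea of what timing sub-steps can look like, here is a generic sketch using only the standard library. It does not reproduce the benchmark script: the step name, the `timed` and `format_log` helpers, and the tab-separated log format are all assumptions made for illustration.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(step, runtimes):
    """Record the wall-clock duration of one sub-step in `runtimes`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        runtimes[step] = time.perf_counter() - start


def format_log(runtimes):
    """Render one 'step<TAB>seconds' line per recorded sub-step."""
    return "\n".join(f"{step}\t{seconds:.6f}" for step, seconds in runtimes.items())


runtimes = {}
with timed("example_step", runtimes):  # placeholder step name
    sum(range(1000))
```

The resulting text from `format_log(runtimes)` could then be written to a file whose name starts with the run number, in the spirit of --benchmarking_run.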
Requires the output of Pmotif Detection to run!
To create raw analysis data from pmotif data, run the create_analysis_data.py script:
source pmotif_lib.env # Export the env vars required by pmotif_lib
python3 create_analysis_data.py --analysis_out outpath --edgelist_name your_edgelist --graphlet_size [3,4]
This script expects the same environment variables as pmotif_detection.py. Make sure not to change the variables or the location of the data on disk between the two runs!
- --edgelist_name and --graphlet_size have the same function as with pmotif_detection.py, and are used to identify the correct pmotif data to process.
- --analysis_out specifies the output path of the analysis data.

Example:
source pmotif_lib.env # Export the env vars required by pmotif_lib
python3 create_analysis_data.py --analysis_out ./out --edgelist_name karate_club.edgelist --graphlet_size 3 --random_graphs 10
Requires the output of Analysis Data Creation to run!
Finally, this script transforms the analysis data into an HTML analysis report. As it is independent of pmotif_lib, you do not need the environment variables anymore.
python3 -m report_creation.analyse_result --analysis_out path/to/analysis/data --edgelist_name your_edgelist --graphlet_size [3,4]
- --edgelist_name and --graphlet_size have the same function as with create_analysis_data.py, and are used to identify the correct analysis data to process.
- --analysis_out specifies the location of the analysis data (same as in create_analysis_data.py) and also serves as the output path for this script.

Example:
python3 -m report_creation.analyse_result --analysis_out ./out --edgelist_name karate_club.edgelist --graphlet_size 3
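Since each stage requires the output of the previous one, the three commands can be chained from a small driver. The sketch below only assembles the command lines shown in this README; `build_pipeline_commands` and `run_pipeline` are hypothetical helpers, and the environment variables from pmotif_lib.env are assumed to be exported in the calling shell for the first two stages.

```python
import subprocess


def build_pipeline_commands(edgelist_name, graphlet_size, random_graphs, analysis_out):
    """Build the three command lines in the order described above."""
    size = str(graphlet_size)
    return [
        ["python3", "pmotif_detection.py",
         "--edgelist_name", edgelist_name,
         "--graphlet_size", size,
         "--random_graphs", str(random_graphs)],
        ["python3", "create_analysis_data.py",
         "--analysis_out", analysis_out,
         "--edgelist_name", edgelist_name,
         "--graphlet_size", size],
        ["python3", "-m", "report_creation.analyse_result",
         "--analysis_out", analysis_out,
         "--edgelist_name", edgelist_name,
         "--graphlet_size", size],
    ]


def run_pipeline(commands):
    """Run each stage in turn; check=True stops at the first failing stage."""
    for command in commands:
        subprocess.run(command, check=True)
```

A call such as `run_pipeline(build_pipeline_commands("karate_club.edgelist", 3, 10, "./out"))` would then mirror the three examples above.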
- dataset/: Placeholder for datasets
- out/: Placeholder for script output
- run_scripts/: Bash script wrappers with logging to run the Python scripts on a cluster
- slurm_helper/: Generators of slurm batch files to run batch jobs
- conversion/: Utility to transform the old pmotif_lib data format into the new format
- notebooks/: Notebooks for prototyping