- Datasets are listed in the `data` folder.
- The complete benchmark of the CUDA-based transitive closure computation experiment can be executed on an Nvidia A100 GPU with a minimum of 40 GB GPU memory. The ThetaGPU single-GPU node is a suitable choice.
- Partial benchmarks can be run on other Nvidia GPUs, but some datasets may exceed the available GPU memory and terminate the program with a `std::bad_alloc: cudaErrorMemoryAllocation: out of memory` error.
- Download and install the NVIDIA CUDA Toolkit from the NVIDIA website: https://developer.nvidia.com/cuda-toolkit-archive
- Follow the installation instructions for your operating system. Make sure to install version 11.4.2 or later.
- Download and install CMake (version 3.9 or later) from the CMake website: https://cmake.org/download/
- Apply the patch from https://github.com/NVIDIA/thrust/pull/1832/files to fix an integer overflow in `thrust::reduce`.
- Transitive closure computation is a fundamental operation in graph analytics and relational algebra.
- We present a CUDA-based implementation of transitive closure computation that is optimized for sparse graphs.
- Build and run instructions are provided below:
cmake --no-warn-unused-cli -DCMAKE_BUILD_TYPE:STRING=RelWithDebInfo -DCMAKE_EXPORT_COMPILE_COMMANDS:BOOL=TRUE -S./ -B./build
cd build
make
This will build the `TC` executable using the nvcc compiler.
- The `TC` executable takes a single argument: the path to the input file containing the graph data. Invoke it as follows:
./TC ../data/data_5.txt
- Run on an interactive node:
ssh <USERNAME>@polaris.alcf.anl.gov
qsub -I -l select=1 -l filesystems=home:eagle -l walltime=1:00:00 -q debug -A dist_relational_alg
module load gcc
cd slog-gpu-backend
git fetch
git reset --hard origin/main
rm -rf build
module purge
module load gcc
module load cmake
module load cudatoolkit-standalone
cmake --no-warn-unused-cli -DCMAKE_BUILD_TYPE:STRING=RelWithDebInfo -DCMAKE_EXPORT_COMPILE_COMMANDS:BOOL=TRUE -S./ -B./build
cd build
make
./TC ../data/data_5.txt
- Transfer a file from local machine to Polaris:
scp data_68993773.txt arsho@polaris.alcf.anl.gov:/home/arsho/slog-gpu-backend/data/
- After creating the build folder and the `TC` executable, run the following commands to check for memory leaks and errors:
cuda-memcheck ./TC ../data/data_7035.txt
========= CUDA-MEMCHECK
...
TC time: 48.691
========= ERROR SUMMARY: 0 errors
compute-sanitizer ./TC ../data/data_7035.txt
========= COMPUTE-SANITIZER
...
TC time: 0.668892
========= ERROR SUMMARY: 0 errors
ssh <USERNAME>@polaris.alcf.anl.gov
qsub -I -l select=1 -l filesystems=home:grand:eagle -l walltime=1:00:00 -q debug -A dist_relational_alg
module purge
module load conda/2023-10-04
conda activate
pip install --extra-index-url https://pypi.nvidia.com cudf-cu11
python test/cuDF/sg.py
(2022-09-08/base) arsho::x3004c0s7b0n0 { ~/slog-gpu-backend/test/cuDF }-> python sg.py
| Dataset | Number of rows | SG size | Iterations | Time (s) |
| --- | --- | --- | --- | --- |
| hipc | 5 | 4 | 3 | 0.016371 |
Error in fe_body. Message: std::bad_alloc: out_of_memory: CUDA error at: /__w/rmm/rmm/include/rmm/mr/device/cuda_memory_resource.hpp:70: cudaErrorMemoryAllocation out of memory
Error in loc-Brightkite. Message: std::bad_alloc: out_of_memory: CUDA error at: /__w/rmm/rmm/include/rmm/mr/device/cuda_memory_resource.hpp:70: cudaErrorMemoryAllocation out of memory
Error in fe_sphere. Message: std::bad_alloc: out_of_memory: CUDA error at: /__w/rmm/rmm/include/rmm/mr/device/cuda_memory_resource.hpp:70: cudaErrorMemoryAllocation out of memory
| CA-HepTh | 51971 | 74618689 | 9 | 21.241212 |
| Dataset | Number of rows | SG size | Iterations | Time (s) |
| --- | --- | --- | --- | --- |
| ego-Facebook | 88234 | 15018986 | 13 | 19.074940 |
| wiki-Vote | 103689 | 5376338 | 4 | 2.603751 |
| luxembourg_osm | 119666 | 245221 | 326 | 2.215113 |
| cti | 48232 | 14503742 | 44 | 3.857438 |
| fe_ocean | 409593 | 65941441 | 77 | 45.979235 |
| wing | 121544 | 647999 | 8 | 0.204277 |
| delaunay_n16 | 196575 | 25994011 | 85 | 14.832548 |
Error in usroads. Message: std::bad_alloc: out_of_memory: CUDA error at: /__w/rmm/rmm/include/rmm/mr/device/cuda_memory_resource.hpp:70: cudaErrorMemoryAllocation out of memory
Error in p2p-Gnutella31. Message: std::bad_alloc: out_of_memory: CUDA error at: /__w/rmm/rmm/include/rmm/mr/device/cuda_memory_resource.hpp:70: cudaErrorMemoryAllocation out of memory
| p2p-Gnutella09 | 26013 | 62056583 | 14 | 13.705286 |
| p2p-Gnutella04 | 39994 | 116931333 | 18 | 48.947088 |
| cal.cedge | 21693 | 23519 | 58 | 0.259069 |
| TG.cedge | 23874 | 608090 | 54 | 0.719743 |
| OL.cedge | 7035 | 285431 | 56 | 0.385674 |
A TC example:
Relation *edge_2__2_1 = new Relation();
Relation *path_2__1_2 = new Relation();
load_relation(path_2__1_2, "path_2__1_2", 2, raw_graph_data,
graph_edge_counts, 1, 0, grid_size, block_size);
load_relation(edge_2__2_1, "edge_2__2_1", 2, raw_reverse_graph_data,
graph_edge_counts, 1, 0, grid_size, block_size);
LIE tc_scc(grid_size, block_size);
tc_scc.add_relations(edge_2__2_1, true);
tc_scc.add_relations(path_2__1_2, false);
float join_detail[10] = {0};
tuple_generator_hook reorder_path_host;
cudaMemcpyFromSymbol(&reorder_path_host, reorder_path_device,
sizeof(tuple_generator_hook));
tuple_copy_hook cp_1_host;
cudaMemcpyFromSymbol(&cp_1_host, cp_1_device, sizeof(tuple_copy_hook));
tc_scc.add_ra(RelationalJoin(edge_2__2_1, FULL, path_2__1_2, DELTA,
path_2__1_2, reorder_path_host, nullptr,
LEFT, grid_size, block_size, join_detail));
tc_scc.fixpoint_loop();
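
To make the fixpoint loop in the example above more concrete, here is a minimal CPU-only sketch of semi-naive transitive closure in plain C++. It is not the repository's API: the function name `transitive_closure`, the `Edge` alias, and the use of `std::set` are all hypothetical and chosen for illustration. It also assumes that `FULL` and `DELTA` in the example refer to the complete relation and the tuples newly derived in the previous iteration, which is the usual semi-naive reading but is not stated in this README.

```cpp
#include <limits>
#include <set>
#include <utility>
#include <vector>

// Hypothetical alias: a tuple (from, to) of either the edge or the path relation.
using Edge = std::pair<int, int>;

// CPU reference sketch (assumption, not the GPU implementation):
//   path(x, y) :- edge(x, y).
//   path(x, z) :- edge(x, y), path(y, z).
// Each iteration joins the full edge relation with only the newest path tuples.
std::set<Edge> transitive_closure(const std::vector<Edge> &edges) {
    std::set<Edge> full(edges.begin(), edges.end());  // every path found so far
    std::set<Edge> delta = full;                      // paths found in the last iteration

    while (!delta.empty()) {
        std::set<Edge> next;
        for (const Edge &e : edges) {
            // Scan the delta tuples whose first column equals e.second.
            auto it = delta.lower_bound({e.second, std::numeric_limits<int>::min()});
            for (; it != delta.end() && it->first == e.second; ++it) {
                Edge candidate{e.first, it->second};
                // Keep only tuples that are not already known (deduplication).
                if (full.count(candidate) == 0) {
                    next.insert(candidate);
                }
            }
        }
        full.insert(next.begin(), next.end());
        delta = std::move(next);  // the loop stops once no new tuple appears
    }
    return full;
}
```

For a chain `1 -> 2 -> 3 -> 4`, calling `transitive_closure({{1, 2}, {2, 3}, {3, 4}})` yields the six reachable pairs, including `(1, 4)`. A sketch like this can serve as a small-input cross-check against the GPU output, although it is far too slow for the larger datasets.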