The project failed when executing pread
Hello, we are running the project on a SmartSSD with Vitis 2021.2 and XRT 2021.2, but it fails when it calls pread. The program output is as follows:
root@cmmhc-PowerEdge-R750:/home/xuekun/code/GNN/damon24-gnn-in-situ-sampling-main/smartssd/sampling/build# ./test_streaming_sampler
Edge file handler: 4
Chunk offsets size: 29
Target nodes size: 0
Edge chunk size: 134217728
Edge file size: 6907023872
Xrt Device Id: 0x55c01cd2f340
Begin sample next epoch... bo_index: 0
cur_frontier size: 0
Prepair frontiers Duration: 0.002423 ms
Begin sample one layer, n_neighbors: 20, number of chunks: 1
Processing chunk 0, chunk frontier size: 0
ERR: pread failed: error: Bad address
We use the papers100M dataset. The sampler configuration in test_streaming_sample.cpp is as follows:
StreamingSampler sampler(
{0}, "parallel_streaming_sampler.xclbin", "parallel_streaming_sampler",
"/mnt/nvme2n1/test_data/papers100M/preprocessed2/streaming_edges.bin",
"/mnt/nvme2n1/test_data/papers100M/preprocessed2/chunks.txt",
"/mnt/nvme2n1/test_data/papers100M/train_nodes.bin", {20, 15, 10},
(size_t)128 * 1024 * 1024);
(Because we didn't find sample_target_nodes.xclbin in the project, we used parallel_streaming_sampler.xclbin)
We then run one epoch:
{
EasyTimer timer("Sampling the whole epoch. ");
sampler.newEpochStart();
}
Can you help us identify where the problem is? Thank you very much!
Hi, thanks for running our program. You can use parallel_streaming_sampler.xclbin directly, but I think the issue might be with the input files. Our program requires you to preprocess the papers100M dataset before running it.
Here’s what you need to do:
- Download the papers100M dataset and unzip it.
- Run the preprocessing notebook at this link: https://github.com/CASP-Systems-BU/damon24-gnn-in-situ-sampling/blob/main/smartssd/sampling/scripts/preprocess/papers100m.ipynb
- Modify the parameters in the following code snippet to match the location of your preprocessed files:
StreamingSampler sampler(
{0}, "parallel_streaming_sampler.xclbin", "parallel_streaming_sampler",
"/mnt/nvme2n1/test_data/papers100M/preprocessed2/streaming_edges.bin",
"/mnt/nvme2n1/test_data/papers100M/preprocessed2/chunks.txt",
"/mnt/nvme2n1/test_data/papers100M/train_nodes.bin", {20, 15, 10},
(size_t)128 * 1024 * 1024);
Thanks for bringing this issue to our attention. If you have any further questions, please let me know.