ldbc/ldbc_graphalytics

Failed to import 1 graph(s): [example-directed]

ouzensang opened this issue · 4 comments

Hi,
I want to run ldbc_graphalytics for Giraph. I downloaded the provided driver 'graphalytics-platforms-powergraph' and built it successfully. However, when I run run_benchmark.sh, I hit the following problem. I have downloaded the data file 'example-directed' and put it in a directory, and I changed the property 'graphs.root-directory' in config/benchmark.properties accordingly. However, the graph still cannot be imported.

11:51 [INFO ] Initializing Benchmark Suite.
11:51 [INFO ] Loading Benchmark...
11:51 [INFO ] Imported 0 graph(s): [].
11:51 [INFO ] Failed to import 1 graph(s): [example-directed].
Exception in thread "main" java.lang.NullPointerException
at science.atlarge.graphalytics.domain.benchmark.Benchmark.verifyGraphInfo(Benchmark.java:179)
at science.atlarge.graphalytics.domain.benchmark.TestBenchmark.setupExperiments(TestBenchmark.java:108)
at science.atlarge.graphalytics.domain.benchmark.TestBenchmark.setup(TestBenchmark.java:62)
at science.atlarge.graphalytics.execution.BenchmarkLoader.parse(BenchmarkLoader.java:107)
at science.atlarge.graphalytics.BenchmarkSuite.main(BenchmarkSuite.java:72)

Separately, I tried to print some internal information from 'graphalytics-core', but after I modified some code in 'ldbc_graphalytics' and reinstalled it, nothing changed. How can I output internal information from the core repository?

Thanks a lot for your early reply!

Hi @ouzensang,

Assuming that you are running a custom benchmark, make sure to check the following:

In config/benchmark.properties:

  • Comment out the includes for benchmarks/test.properties and benchmarks/standard.properties.
  • Make sure that graphs.root-directory refers to a directory where the example-directed.e, example-directed.v and example-directed.properties files are located.
  • If validation is enabled, make sure that graphs.validation-directory refers to a directory where the example-directed-* files are located.
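The file check from the second bullet can be scripted. The following is a minimal sketch, assuming only the three file names listed above; the helper `missing_graph_files` is hypothetical and not part of Graphalytics.

```python
from pathlib import Path

# Suffixes named in the checklist above: edge file, vertex file, graph properties.
GRAPH_FILE_SUFFIXES = (".e", ".v", ".properties")


def missing_graph_files(root_directory, graph_name="example-directed"):
    """Return the expected graph files that are absent from root_directory.

    Hypothetical helper, not part of Graphalytics. It only checks that the
    files exist, not that their contents are valid.
    """
    root = Path(root_directory)
    expected = [graph_name + suffix for suffix in GRAPH_FILE_SUFFIXES]
    return [name for name in expected if not (root / name).is_file()]
```

Calling `missing_graph_files` with the value of graphs.root-directory returns an empty list when all three files are present. A non-empty result names the files to look for first.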

As a reference, you can compare your current setup with a different platform (GraphMat or Giraph) using the Graphalytics installer.

Regarding the second problem: once you've added some output in graphalytics-core and have installed it using Maven, make sure to build the driver again as well.

Hi @amusaafir, I downloaded the Giraph 'installer', but when I run ./bin/sh/run_benchmark.sh, I hit a problem. When running the test benchmark, the log stops at '[INFO ] The benchmark runner becomes ready within 3 seconds.' and prints nothing further. After the 600 s timeout, the job is killed. The related information is shown in the following picture. It looks like it cannot run any algorithm. Should I install Giraph myself?
image

I meant that you can use the installer for Giraph/GraphMat as a reference for configuring the PowerGraph benchmark. To run Giraph, you'd need Hadoop installed (and in platform.properties, you'd set the installation directory accordingly using the platform.hadoop.home property). Frankly, with the Graphalytics installer, GraphMat is easier to set up and configure, as long as you've installed the requirements listed on the installer's page.
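To sanity-check the platform.hadoop.home setting before starting a run, something like the following could help. It is a rough sketch rather than anything Graphalytics ships: `read_properties` and `hadoop_home_problem` are hypothetical names, and the parser handles only plain `key = value` or `key: value` lines (no line continuations or escape sequences).

```python
import re
from pathlib import Path


def read_properties(text):
    """Parse simple 'key = value' or 'key: value' lines into a dict.

    Simplified on purpose: lines without a separator are skipped, and
    repeated keys (such as 'include') keep only the last value.
    """
    properties = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "!")):
            continue  # skip blank lines and comments
        parts = re.split(r"\s*[=:]\s*", line, maxsplit=1)
        if len(parts) != 2:
            continue  # no separator on this line
        properties[parts[0]] = parts[1]
    return properties


def hadoop_home_problem(properties):
    """Return a description of the problem, or None if the setting looks fine."""
    home = properties.get("platform.hadoop.home", "")
    if not home:
        return "platform.hadoop.home is not set"
    if not Path(home).is_dir():
        return "platform.hadoop.home does not point to a directory: " + home
    return None
```

Typical use would be `hadoop_home_problem(read_properties(Path("config/platform.properties").read_text()))`, where the file path is an assumption about the local layout.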

Regarding the PowerGraph setup, can you post your platform.properties and benchmark.properties file?

Hi, @amusaafir
I have installed Hadoop 2.6.1 and set the installation directory. The following are my platform.properties and benchmark.properties files for the Giraph installer.

platform.hadoop.home: /home/janusgraph/hadoop-2.6.1
platform.giraph.zoo-keeper-address: 192.168.2.9:2181
platform.giraph.job.heap-size: 4096
platform.giraph.job.memory-size: 8192
platform.giraph.job.worker-count: 1
platform.giraph.job.worker-cores: 4
platform.hadoop.hdfs.directory: /graphalytics

[Benchmark]
include = benchmarks/custom.properties
benchmark.description =
graphs.root-directory = /home/janusgraph/Flash/ldbc_Giraph/graphalytics-benchmark/datasets
graphs.cache-directory = ./cache/
graphs.validation-directory = /home/janusgraph/Flash/ldbc_Giraph/graphalytics-benchmark/datasets
graphs.output-directory = ./output/
benchmark.executor.port = 8011
benchmark.runner.port = 8012
benchmark.runner.max-memory =

include = platform.properties
include = environment.properties
include = graphs.properties
include = pricing.properties

I downloaded the Giraph 'installer' and built it, but when I run ./bin/sh/run_benchmark.sh, I get the following errors. I don't know why the error "Failed to find any yarn application ids in the driver log" occurs.

10:46 [INFO ] The preparation for the benchmark succeed (if needed).
10:46 [INFO ] The benchmark runner becomes ready within 1 seconds.
10:46 [ERROR] A benchmark failure (EXE) is caught by the runner.
10:46 [ERROR] Failed to find any yarn application ids in the driver log.
10:46 [WARN ] Terminating runner process forcibly.
10:46 [WARN ] Terminating process 19574 focibly.
10:46 [WARN ] Executing command "kill -9 19574"
10:46 [ERROR] Failed to kill runner process.
10:46 [INFO ] The benchmark run is sucessfully terminated.