ldbc/ldbc_graphalytics

Are all graphs but one undirected?

whatsthecraic opened this issue · 2 comments

Dear folks,
On the datasets page, most graphs are generated with either Graph500 or the LDBC Datagen, so they are undirected. The only directed graph is Twitter, but it is labelled "Incorrect Vertex Count". Could you elaborate on whether more directed graphs will be included, and on what the consequences are of using the Twitter graph in a benchmark?

Correct - the bulk of the datasets are undirected. There is currently no plan to add more directed graphs. However, you can always use your own directed datasets.
The Twitter dataset should work fine (including validation). IIRC, the number of vertices specified in the properties file differs from the actual number (52,579,678). I'll state this explicitly on the site and will update the properties file itself once I have access to change it.
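
In case it helps to check this kind of mismatch locally, here is a minimal Python sketch. It assumes the vertex file lists one vertex ID per line and that the properties file contains a key ending in `.meta.vertices`; both are assumptions about the file layout, and the function names are hypothetical, so adjust them to the files you actually have.

```python
def count_vertices(vertex_path):
    """Count non-empty lines, assuming one vertex ID per line."""
    with open(vertex_path) as f:
        return sum(1 for line in f if line.strip())


def declared_vertices(properties_path):
    """Return the value of the first key ending in '.meta.vertices'.

    The key suffix is an assumption about the properties file layout.
    Returns None if no such key is found.
    """
    with open(properties_path) as f:
        for line in f:
            line = line.strip()
            # Skip blank lines, comments, and lines without a key/value pair.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, value = line.split("=", 1)
            if key.strip().endswith(".meta.vertices"):
                return int(value.strip())
    return None


def vertex_count_matches(vertex_path, properties_path):
    """True if the declared vertex count equals the counted one."""
    return count_vertices(vertex_path) == declared_vertices(properties_path)
```

Calling `vertex_count_matches` with the paths to a dataset's vertex file and properties file returns `False` when the declared count is off, as described above for Twitter.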

Here's a "colour-coded" spreadsheet with the datasets: https://docs.google.com/spreadsheets/d/e/2PACX-1vTuNlJDe521a21Conz5IjPiDwVvd0Rrd_WwJ7aUkszABDi9lvIwezesvzn2S2XblELa7wUAtisbu-jF/pubhtml

Other than Twitter and the example graph, the cit-Patents and wiki-Talk datasets are also directed.