How scalable is this?
Opened this issue · 1 comment
uriva commented
- Would it support a chunk of 5 billion nodes/edges?
- If each has minimal payload, how much time would the process take?
jeffreylovitz commented
Hi @uriva,
- If your server has enough RAM to store and query a graph with 5 billion entities, you should not have an issue running the bulk loader. It automatically divides your input into batches that populate a buffer of up to 2 gigabytes, and maintains a dictionary mapping every node to its identifier.
- I'd expect this to take dozens of hours, but there are too many factors in play to be precise. Generally, load time scales linearly with input size. Building a graph with about 5 million nodes, 5 million edges, and 20 million properties takes 220 seconds on my system, so scaling that by a factor of 500 gives roughly 30 hours as a very rough estimate.
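
The batching behavior described above can be sketched roughly like this. This is illustrative Python, not the actual bulk loader code; the function name, the 2 GB cap constant, and the per-row size accounting are assumptions for the sketch:

```python
MAX_BUFFER_BYTES = 2 * 1024 ** 3  # assumed 2-gigabyte batch cap

def load_in_batches(rows, max_bytes=MAX_BUFFER_BYTES):
    """Split (key, payload) rows into batches no larger than max_bytes,
    while building a dictionary mapping each node key to an internal ID."""
    node_ids = {}                      # node key -> identifier
    batches, buffer, buffer_size = [], [], 0
    for key, payload in rows:
        node_ids.setdefault(key, len(node_ids))  # assign IDs in arrival order
        size = len(payload)
        if buffer and buffer_size + size > max_bytes:
            batches.append(buffer)     # the real loader would send this batch
            buffer, buffer_size = [], 0
        buffer.append((key, payload))
        buffer_size += size
    if buffer:
        batches.append(buffer)         # flush the final partial batch
    return node_ids, batches
```

The point is that only one buffer of batched input needs to exist at a time, but the node-to-identifier dictionary covers the whole input, which is why total RAM is the limiting factor for 5 billion entities.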
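
The 30-hour figure above is just the linear extrapolation; spelled out:

```python
# Back-of-envelope scaling check, not a measurement.
baseline_seconds = 220   # ~5M nodes + 5M edges + 20M properties on my system
scale_factor = 500       # ~10M entities scaled up to ~5B entities
estimate_hours = baseline_seconds * scale_factor / 3600
print(round(estimate_hours, 1))  # ~30.6 hours
```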