Compute geodesic distance between node a set of node pairs efficiently

Question

Compute geodesic distance between node a set of node pairs efficiently

Closed this issue 4 months ago · 11 comments

I am trying to compute the geodesic distance between a set of node pairs and I am running into a few problems here. navis.dist_between(...) only works if the source and the root node id are single integer values but throws an error if source and root ids are lists like shown below

distances = navis.dist_between(tree_neuron, pairs[:, 0], pairs[:, 1])
# throws ValueError: Can only process single nodes/vertices. Use navis.geodesic_matrix instead.

If I use navis.geodesic_matrix i run into issues with very long computation times since my neurons have skeletons with many nodes and thus the resulting matrix is really large and takes forever to compute.

Do you have some guidance on what's the most efficient way to compute geodesic distances between only a selected set of node pairs rather than the whole distance matrix?

Answer 1 · 2024-09-11T15:39:47.000Z

hi @jakobtroidl! in my experience https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.csgraph.dijkstra.html#scipy.sparse.csgraph.dijkstra or https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.csgraph.shortest_path.html are quite fast for this kind of thing. Would just need to get the adjacency matrix describing the skeleton and plug that in as a csr_array. The indices argument lets you select only a subset of indices to compute distances from - while it will give you back distances to all target nodes, you can just subset the target indices you need and it is typically still quite fast. You will probably also want directed=False here.

Answer 2 · 2024-09-11T17:17:03.000Z

Is this on skeletons?

Answer 3 · 2024-09-11T17:18:40.000Z

Thanks @bdpedigo. Yes, this is on skeletons @schlegelp.

Answer 4 · 2024-09-11T17:24:42.000Z

NVM, your example already suggests it's skeletons. Do you have navis-fastcore installed? If so (and your Navis is >=1.7.0), geodesic_matrix should be blazingly fast.

If you wanted to, you could also use navis-fastcore's own geodesic_matrix() function directly as Navis doesn't wrap it's full functionality.

If that's still not cutting it, then we can think about implementing a dedicated function for distances between pairs in fastcore.

Answer 5 · 2024-09-11T18:41:15.000Z

Thanks @schlegelp! Using navis-fastcore seems to give me some significant gains. It reduces runtime from 5s to about 500ms for a ~7,500 node skeleton. I just ran pip install navis-fastcore and the navis.geodesic_matrix function became significantly faster. I found the documentation useful.

Answer 6 · 2024-09-11T19:34:40.000Z

Glad it worked! I would have expected more than a 10x speed-up to be honest...

Answer 7 · 2024-09-16T15:40:44.000Z

Hi @jakobtroidl. I added a new function to navis_fastcore for you: update to version 0.0.6 and check out geodesic_pairs().

It looks to be pretty fast but I did only very limited testing on my end and there might be situations where computing a matrix with a subset of source/targets would end up being faster after all. Let me know what you think.

Answer 8 · 2024-09-16T22:04:22.000Z

Wow, that's fantastic. Thanks so much @schlegelp. I'll test it.

Answer 9 · 2024-09-19T22:44:06.000Z

^^ so far the geodesic_pairs function works really well for me. I am bit confused about how to use the weights argument. Is my assumption correct that for the child -> parent relationship of the root node, I should just use a weight of zero? how are you doing that in your examples ?

Answer 10 · 2024-09-20T08:43:46.000Z

Yes, use a weight of 0. You can use navis_fastcore.dag.parent_dist with root_dist=0 to get the weights.

Answer 11 · 2024-09-20T14:10:54.000Z

That works perfectly. Here's some quick example code for future reference:

import navis
import navis_fastcore as fastcore

neuron = navis.example_neurons(1)

node_ids = neuron.nodes.node_id.values
parent_ids = neuron.nodes.parent_id.values
coords = neuron.nodes[["x", "y", "z"]].values

weights = fastcore.dag.parent_dist(node_ids, parent_ids, coords, root_dist=0)
matrix = fastcore.geodesic_matrix(node_ids, parent_ids, weights=weights)