Compute geodesic distance between node a set of node pairs efficiently
Closed this issue ยท 11 comments
I am trying to compute the geodesic distance between a set of node pairs and I am running into a few problems here. navis.dist_between(...)
only works if the source and the root node id are single integer values but throws an error if source and root ids are lists like shown below
distances = navis.dist_between(tree_neuron, pairs[:, 0], pairs[:, 1])
# throws ValueError: Can only process single nodes/vertices. Use navis.geodesic_matrix instead.
If I use navis.geodesic_matrix
i run into issues with very long computation times since my neurons have skeletons with many nodes and thus the resulting matrix is really large and takes forever to compute.
Do you have some guidance on what's the most efficient way to compute geodesic distances between only a selected set of node pairs rather than the whole distance matrix?
hi @jakobtroidl! in my experience https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.csgraph.dijkstra.html#scipy.sparse.csgraph.dijkstra or https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.csgraph.shortest_path.html are quite fast for this kind of thing. Would just need to get the adjacency matrix describing the skeleton and plug that in as a csr_array
. The indices
argument lets you select only a subset of indices to compute distances from - while it will give you back distances to all target nodes, you can just subset the target indices you need and it is typically still quite fast. You will probably also want directed=False
here.
Is this on skeletons?
Thanks @bdpedigo. Yes, this is on skeletons @schlegelp.
NVM, your example already suggests it's skeletons. Do you have navis-fastcore
installed? If so (and your Navis is >=1.7.0
), geodesic_matrix
should be blazingly fast.
If you wanted to, you could also use navis-fastcore
's own geodesic_matrix() function directly as Navis doesn't wrap it's full functionality.
If that's still not cutting it, then we can think about implementing a dedicated function for distances between pairs in fastcore
.
Thanks @schlegelp! Using navis-fastcore
seems to give me some significant gains. It reduces runtime from 5s to about 500ms for a ~7,500 node skeleton. I just ran pip install navis-fastcore
and the navis.geodesic_matrix
function became significantly faster. I found the documentation useful.
Glad it worked! I would have expected more than a 10x speed-up to be honest...
Hi @jakobtroidl. I added a new function to navis_fastcore
for you: update to version 0.0.6
and check out geodesic_pairs()
.
It looks to be pretty fast but I did only very limited testing on my end and there might be situations where computing a matrix with a subset of source/targets would end up being faster after all. Let me know what you think.
Wow, that's fantastic. Thanks so much @schlegelp. I'll test it.
^^ so far the geodesic_pairs
function works really well for me. I am bit confused about how to use the weights
argument. Is my assumption correct that for the child -> parent relationship of the root node, I should just use a weight of zero? how are you doing that in your examples ?
Yes, use a weight of 0
. You can use navis_fastcore.dag.parent_dist
with root_dist=0
to get the weights.
That works perfectly. Here's some quick example code for future reference:
import navis
import navis_fastcore as fastcore
neuron = navis.example_neurons(1)
node_ids = neuron.nodes.node_id.values
parent_ids = neuron.nodes.parent_id.values
coords = neuron.nodes[["x", "y", "z"]].values
weights = fastcore.dag.parent_dist(node_ids, parent_ids, coords, root_dist=0)
matrix = fastcore.geodesic_matrix(node_ids, parent_ids, weights=weights)