jewettaij/superpose3d

to determine which pairs of points from either cloud correspond

ChenShengsGitHub opened this issue · 3 comments

This function does not attempt to determine which pairs of points from either cloud correspond. Instead, it infers them from the order of the arrays.
I wonder if there exisits methods that determine which pairs of points from either cloud correspond, if true, could you please give me a link?

Hi Chen Sheng

In general, there are several approaches, like ICP which work on general 3D point clouds (without ordering). If my understanding is correct, these algorithms are not exhaustive or deterministic, and they are not guaranteed to pick the optimal pairs of points from either cloud. But they aim to do a reasonably good job. I am not an expert on this topic, but after a quick google search, I noticed a few links which might be helpful:

https://en.wikipedia.org/wiki/Point_set_registration
https://en.wikipedia.org/wiki/Iterative_closest_point
https://arxiv.org/abs/2001.07715

Forgive the self-promotion below. I don't know how relevant this to most people's interests...

When comparing the shapes of two proteins, you can do better. People are sometimes interested to use protein shapes as a tool to infer which amino acids from a group of proteins were inherited from a common ancestor. (Because protein shape evolves more slowly than sequence.) This is a bioinformatics problem. Some strategies to solve this problem will rotate and superimpose the two protein structures together and pick nearby pairs of points (one from either protein). I wrote one of these programs, although the source code hosted there no longer compiles with modern compilers. (I no longer work there but I can email anyone who wants it with updated code.) Of course, by now there are probably better algorithms. Since I left that field, at least one or two newer algorithms have been published which may perform better on some kinds of proteins. But all of these protein-structure-comparison algorithms that I know of exploit the fact that the cloud of points in a protein is ordered according to the protein's sequence (with typically one point per amino acid). So successive points (amino acids) from one protein are always matched with successive amino acids from the other protein (because insertion or deletion mutations preserve the order of the remaining amino acids). This makes it possible to perform an exhaustive, optimal search for matched pairs (unlike general strategies like ICP).

I hope this reply is not too late to be useful.
-Andrew