PhylogeneticDistanceMatrix.distances: return symmetric distances
nick-youngblut opened this issue · 4 comments
It would be helpful if the user could call PhylogeneticDistanceMatrix.distances(full=True) to get back a vector or matrix of symmetric distances instead of just the lower triangle (lacking the diagonal).
Right now, this method returns a list of distances.
If implemented, would you want it to return the concatenation of this list and [0] * n, where n = number of taxa?
I guess that the user can just make the symmetric matrix via:
taxa = t.taxon_namespace
np.array([pdc(t1,t2) for t2 in taxa for t1 in taxa]).reshape(len(taxa), len(taxa))
...but it would be nice to have a simpler method. At least for me, I wanted a symmetric matrix (as shown above) that I could feed to scikit-learn for clustering.
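For reference, here is a minimal NumPy-free sketch of what a full=True style result could look like if it were built from the existing list. It assumes the list holds the strict lower triangle in row-major order, i.e. (1,0), (2,0), (2,1), and so on; that ordering and the helper name lower_triangle_to_symmetric are assumptions for illustration, not DendroPy's documented behaviour.

```python
def lower_triangle_to_symmetric(distances, n):
    """Expand a strict lower triangle (row-major, no diagonal) into an n x n nested list.

    Hypothetical helper: the input ordering is an assumption, not DendroPy's API.
    """
    expected = n * (n - 1) // 2
    if len(distances) != expected:
        raise ValueError(
            f"expected {expected} distances for {n} taxa, got {len(distances)}"
        )
    # Start from an all-zero matrix so the diagonal is already filled in.
    matrix = [[0.0] * n for _ in range(n)]
    k = 0
    for i in range(1, n):
        for j in range(i):
            # Mirror each value across the diagonal.
            matrix[i][j] = matrix[j][i] = distances[k]
            k += 1
    return matrix
```

The result is a plain list of lists, so it does not add a NumPy dependency, and it can still be handed to numpy.array() by anyone who needs an array.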
Fair enough.
But again, what would be the expected return value of this method with this option (given that DendroPy does not require or use NumPy)?
Hmm... without a NumPy requirement, the method would have to return plain lists, and the user would then have to convert them to an array, such as via:
numpy.array([numpy.array(xi) for xi in x])
...which is nearly as much work as:
taxa = t.taxon_namespace
np.array([pdc(t1,t2) for t2 in taxa for t1 in taxa]).reshape(len(taxa), len(taxa))
...so maybe such a feature would not actually be that helpful
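For anyone landing here with the same need, a pure-Python sketch of the workaround that computes each pair only once and mirrors it, rather than evaluating all n * n pairs. The function name symmetric_distance_rows and its dist argument are hypothetical; dist stands in for whatever pairwise distance callable is in use (pdc in the snippets above), and the sketch assumes that callable is symmetric.

```python
def symmetric_distance_rows(items, dist):
    """Build a symmetric nested-list matrix from a pairwise distance callable.

    Hypothetical helper: `dist(a, b)` is assumed to be symmetric, so only the
    lower triangle is evaluated and each value is mirrored.
    """
    n = len(items)
    rows = [[0.0] * n for _ in range(n)]  # diagonal stays at zero
    for i in range(n):
        for j in range(i):
            rows[i][j] = rows[j][i] = dist(items[i], items[j])
    return rows


# Toy usage with points on a line standing in for taxa.
toy = symmetric_distance_rows([0, 3, 7], lambda a, b: abs(a - b))
```

With a tree, the call would presumably look like symmetric_distance_rows(list(t.taxon_namespace), pdc). Note that numpy.array() accepts a nested list directly, so the per-row conversion is not needed on the NumPy side.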