Using pyTorch 0.1.12 and Python 2.7.
Given a weighted adjacency matrix of a connected finite tree, calculate the optimal embedding in Euclidean space and Hyperbolic space using Stochastic Gradient Descent(SGD).
NOTE: Since the loss function w.r.t. embedding vectors is not convex, SGD does not guarantee convergence to global minima! Run a couple of times until you have obtain the lowest loss score.
TODO:
-
add images
-
find new method to guarantee convergence to global minima
Given a connected finite tree that looks like this:
We can have a lot of different visualizations for this tree, since we dont care about the angles and real length of edges. For example:
The two trees are equivalent! So a better representation of this tree is to use adjacency matrix. Let each edge to have a tree length 1, then the weighted tree adjacency matrix is:
To find an optimal embedding of this tree in Euclidean space, we transform the problem into an optimization problem so we can solve (approximate might be a better word) with SGD.
loss = L2_loss(space_distance_matrix, tree_distance_matrix)
And so now we minimise loss w.r.t. embedding vectors of vertices using SGD. (note: space_distance depends on embedding vectors, tree_distance stays constant)
We then have:
with loss = 7.166, and the space distance matrix look like:
Using Riemannian SGD, we can also find the optimal embedding in Hyperbolic space. We then have:
(plotted using GeoGebra worksheet)
with loss = 1.928, and the space distance matrix look like:
This shows that embedding trees in Euclidean space is not as efficient as in Hyperbolic space, or in other words its easier to approximate trees in Hyperbolic space.