Visualize non-Euclidean embeddings
Closed this issue · 3 comments
I'm trying to visualize 128-dimensional embeddings in hyperbolic space using Atlas. However, I noticed that Atlas's create_index function includes the line
'nearest_neighbor_index_hyperparameters': json.dumps({'space': 'l2', 'ef_construction': 100, 'M': 16})
in the build_template. This line uses the l2 distance, whereas I need the hyperbolic distance function because my embeddings are in hyperbolic space. I know your dimensionality reduction algorithm is closed source, but I was wondering:
- Is there a way to specify a custom distance function when creating embeddings? This could be a helpful feature for users looking for more customization.
- Is the l2 space used when performing dimensionality reduction, or is it only used for nearest neighbor search?
Any answers to this would be greatly appreciated. Thanks!
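To make the mismatch concrete, here is a minimal sketch comparing the l2 distance with the hyperbolic distance. It assumes the embeddings live in the unit Poincaré ball with curvature -1; the function names are illustrative and are not part of Atlas.

```python
import numpy as np

def poincare_distance(u, v, eps=1e-9):
    """Geodesic distance in the unit Poincaré ball (assumed curvature -1)."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    sq_diff = np.sum((u - v) ** 2)
    # Clamp so points numerically on the boundary do not cause division by zero.
    denom = max(1.0 - np.sum(u ** 2), eps) * max(1.0 - np.sum(v ** 2), eps)
    return float(np.arccosh(1.0 + 2.0 * sq_diff / denom))

def l2_distance(u, v):
    """Plain Euclidean distance, i.e. what 'space': 'l2' measures."""
    return float(np.linalg.norm(np.asarray(u, float) - np.asarray(v, float)))

# Two pairs with the same Euclidean gap: one near the origin, one near the boundary.
near_origin = (np.array([0.00, 0.0]), np.array([0.05, 0.0]))
near_boundary = (np.array([0.90, 0.0]), np.array([0.95, 0.0]))

for name, (a, b) in [("near origin", near_origin), ("near boundary", near_boundary)]:
    print(name, "l2:", l2_distance(a, b), "hyperbolic:", poincare_distance(a, b))
```

Both pairs are the same l2 distance apart, but the pair near the boundary is several times farther apart in hyperbolic distance, so an l2 index can rank neighbors differently from the hyperbolic metric.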
There is not currently a way to specify a custom distance function and there most likely won't be one in the short term--but it couldn't hurt to just try the high-d hyperbolic space and see if it looks OK?
I tried visualizing 128-dimensional points in the Poincaré model of hyperbolic geometry. The data shows some structure, but not much. It is unclear whether this is because the dimensionality reduction violates the assumptions of the manifold or because the data is inherently noisy. So, to answer your question, the data does look OK, but that is not entirely satisfactory for me.
Also, I just wanted to clarify: does the dimensionality reduction algorithm assume we are in Euclidean space? Or is it just the nearest neighbor search which assumes we are in Euclidean space?
Thanks!
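One way to probe whether the l2 assumption is what hurts is to check how often Euclidean and hyperbolic nearest neighbors agree on a sample of the data. The sketch below assumes points in the unit Poincaré ball with curvature -1 and uses brute-force distances, so it only suits a subsample. The random array X is placeholder data, not real embeddings, and none of these helpers come from Atlas.

```python
import numpy as np

def pairwise_l2(X):
    """Dense matrix of Euclidean distances between all rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt(np.sum(diff ** 2, axis=-1))

def pairwise_poincare(X, eps=1e-9):
    """Dense matrix of Poincaré-ball distances (assumed curvature -1)."""
    sq_diff = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    one_minus_sq = np.maximum(1.0 - np.sum(X ** 2, axis=1), eps)
    arg = 1.0 + 2.0 * sq_diff / (one_minus_sq[:, None] * one_minus_sq[None, :])
    return np.arccosh(np.maximum(arg, 1.0))

def knn_indices(D, k):
    """Indices of the k nearest neighbors of each row, excluding the point itself."""
    D = D.copy()
    np.fill_diagonal(D, np.inf)
    return np.argsort(D, axis=1)[:, :k]

def mean_knn_overlap(X, k=10):
    """Average fraction of k nearest neighbors shared by the two metrics."""
    nn_l2 = knn_indices(pairwise_l2(X), k)
    nn_hyp = knn_indices(pairwise_poincare(X), k)
    overlaps = [len(set(a) & set(b)) / k for a, b in zip(nn_l2, nn_hyp)]
    return float(np.mean(overlaps))

# Placeholder data: 200 random 128-dimensional points at varying radii inside the ball.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 128))
X = X / np.linalg.norm(X, axis=1, keepdims=True) * rng.uniform(0.1, 0.99, size=(200, 1))

overlap = mean_knn_overlap(X, k=10)
print(f"mean 10-NN overlap between l2 and hyperbolic: {overlap:.2f}")
```

An overlap near 1 would suggest the l2 neighbor graph is a reasonable proxy for this data; a low overlap would suggest the metric mismatch is contributing to the weak structure. This only tests the nearest neighbor step, not the closed-source dimensionality reduction itself.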
Yes, the dimensionality reduction assumes a kernel that is Euclidean.
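Given a Euclidean kernel, one possible workaround is to flatten the embeddings into the tangent space at the origin before uploading them. This is a sketch, not an Atlas feature, and it assumes the unit Poincaré ball with curvature -1. The helper log_map_origin is hypothetical. It rescales each point so that its Euclidean norm equals its hyperbolic distance from the origin; distances between two points away from the origin are still distorted.

```python
import numpy as np

def log_map_origin(X, eps=1e-9):
    """Map Poincaré-ball points to the tangent space at the origin.

    Each row keeps its direction; its norm r becomes 2 * artanh(r), which is
    the hyperbolic distance from the origin under curvature -1 (an assumption).
    """
    X = np.asarray(X, dtype=float)
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    # Clip to avoid dividing by zero at the origin and artanh(1) = inf at the boundary.
    clipped = np.clip(norms, eps, 1.0 - eps)
    scale = 2.0 * np.arctanh(clipped) / clipped
    return X * scale

points = np.array([[0.5, 0.0], [0.0, 0.9], [0.0, 0.0]])
flattened = log_map_origin(points)
print(np.linalg.norm(flattened, axis=1))
```

The flattened vectors can then be treated as ordinary Euclidean embeddings. Whether the resulting map looks better than the raw coordinates would have to be checked empirically.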