enjalot/latent-scope

Support aligning UMAPs

enjalot opened this issue · 2 comments

We should be able to compare embeddings more directly by trying to align our UMAP projections:
https://umap-learn.readthedocs.io/en/latest/aligned_umap_basic_usage.html

I'm interested to find out if just initializing with the projection of one UMAP would be enough to get improved results.

Then it would also be interesting to try running AlignedUMAP with a user-defined selection of embeddings (e.g. OpenAI + Voyage + Nomic).

A script that takes in a list of embedding_ids and outputs a new UMAP would be a great place to start. The fact that it's based on multiple embeddings is fine as far as any downstream visuals or tasks are concerned.
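A rough sketch of what such a script could do, assuming every embedding covers the same rows in the same order, so that row i in one embedding corresponds to row i in the next. The function names `build_identity_relations` and `align_umaps` are hypothetical and this is not the code in the repo; it only uses `umap.AlignedUMAP` with the `relations` argument described in the linked docs. I have not checked whether AlignedUMAP accepts slices with different dimensionality, so embeddings from different models may need handling for that first.

```python
def build_identity_relations(n_points, n_embeddings):
    """Relations between consecutive slices for AlignedUMAP.

    Each dict maps a row index in slice k to the matching row index in
    slice k+1. Here we assume the rows are identical across embeddings,
    so every mapping is the identity.
    """
    identity = {i: i for i in range(n_points)}
    # AlignedUMAP expects one dict per consecutive pair of slices.
    return [dict(identity) for _ in range(n_embeddings - 1)]


def align_umaps(embeddings, n_neighbors=25, min_dist=0.1):
    """Fit AlignedUMAP over a list of (n_points, n_dims) arrays.

    Returns one 2D projection per input embedding.
    """
    import umap  # imported lazily because umap-learn is slow to import

    n_points = len(embeddings[0])
    if any(len(e) != n_points for e in embeddings):
        raise ValueError("all embeddings must have the same number of rows")

    relations = build_identity_relations(n_points, len(embeddings))
    mapper = umap.AlignedUMAP(n_neighbors=n_neighbors, min_dist=min_dist)
    mapper.fit(embeddings, relations=relations)
    return list(mapper.embeddings_)
```

Loading the arrays for each embedding_id and writing out the projections is left out here, since that depends on how the embeddings are stored.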

I've pushed code to main that adds an --align option to the umapper script:
ls-umap datavis-misunderstood embedding-002 25 0.1 --align="embedding-006,embedding-007,embedding-008"

It generates an aligned UMAP for each embedding in the list, as well as the usual one specified.
I still want to figure out a way to incorporate it into the UI. Perhaps a button that shows all the available embeddings as checkboxes, so you can pick which ones to align with the embedding you are about to UMAP.

It also makes me want an explore interface focused on a group of aligned UMAPs, where you could easily switch between them and highlight the differences (for example, see the points that move the most between two UMAPs).
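One simple way to find the points that move the most could look like the sketch below. It assumes the two projections are aligned and list the same points in the same order; `rank_by_displacement` is a hypothetical helper, not something that exists in the repo.

```python
import math


def rank_by_displacement(umap_a, umap_b, top_k=10):
    """Return (index, distance) pairs for the points that moved the most.

    umap_a and umap_b are sequences of (x, y) coordinates for the same
    points in the same order, taken from two aligned projections.
    """
    if len(umap_a) != len(umap_b):
        raise ValueError("projections must contain the same number of points")

    # Euclidean distance between each point's position in the two projections.
    moves = [(i, math.dist(a, b)) for i, (a, b) in enumerate(zip(umap_a, umap_b))]
    # Largest movement first.
    moves.sort(key=lambda pair: pair[1], reverse=True)
    return moves[:top_k]
```

The returned indices could then drive highlighting in the scatterplot when switching between two UMAPs. The distances are only meaningful when the projections are aligned, since two independent UMAP runs can differ by arbitrary rotation or layout.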

The UI has now been implemented in Setup and is included in the 0.1.2 release.