timescale/pgvectorscale

Feature Request: Access underlying clusters/groups

If I understand the idea behind DiskANN correctly (I may be completely misunderstanding it), it performs clustering for free as a by-product of building the index (like HNSW). It would be an amazing feature to be able to get each vector's "cluster". This would be really useful for entity resolution, de-duplication, and blocking, without having to run a query for every point in the database.

< 2000, so it has to be 1999 or less

Unfortunately, DiskANN doesn't do clustering, so there are no clusters for us to expose. However, we'll consider adding clustering functions in the future.
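
For the de-duplication use case in the meantime, one possible workaround is to group rows outside the index. The sketch below is only an illustration, not a pgvectorscale feature: it assumes you have already collected pairs of row ids judged to be near-duplicates (for example, from nearest-neighbor queries filtered by a distance threshold, which does still mean querying per point), and it merges those pairs into groups with a union-find. The function name `group_by_neighbors` and its inputs are hypothetical.

```python
from collections import defaultdict


def group_by_neighbors(pairs, n):
    """Merge near-duplicate pairs into groups (connected components).

    pairs: iterable of (id_a, id_b) with ids in range(n); assumed to come
           from your own similarity queries, not from the index itself.
    n:     total number of rows.
    Returns a sorted list of groups, each a list of row ids.
    """
    parent = list(range(n))

    def find(x):
        # Walk up to the root, shortening the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in pairs:
        ra, rb = find(a), find(b)
        if ra != rb:
            # Attach the larger root to the smaller so roots stay stable.
            parent[max(ra, rb)] = min(ra, rb)

    groups = defaultdict(list)
    for i in range(n):
        groups[find(i)].append(i)
    return sorted(groups.values())


if __name__ == "__main__":
    print(group_by_neighbors([(0, 1), (1, 2), (3, 4)], 6))
```

Note that connected components are transitive: if A is close to B and B is close to C, all three land in one group even when A and C are far apart, so the threshold used to produce the pairs matters a lot.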