hanabi1224/RuAnnoy

[feature request] building index on the client

josephrocca opened this issue · 2 comments

Hello again! I've come across the need again to build an index on the client, and am wondering if there has been any reconsideration of this as a feature?

My particular use case is for implementing a retrieval/memory system for OpenCharacters - it's a free, fully local/client-side application (other than calls which go directly to OpenAI), so there's no server to handle index creation.

It looks like Voy may be an option for this once it stabilises a bit, but for now I haven't been able to get it working.

Voy and RuAnnoy currently seem to be the only two options in this space. I'll benchmark them once I manage to get Voy working. But for RuAnnoy to be practically useful on the client in many cases, I think it probably needs the ability to create/update indices.

If this is completely out of the question, feel free to close this. Thanks!

I'm open to adding indexing support to this library, but unfortunately I have no bandwidth. PRs are welcome.
I took a look at OpenCharacters and it's pretty cool. If I understand correctly, the index is built from sentences, in which case vector search might not be the best solution. A text-based search engine like Lucene or Elasticsearch would be a better fit. You could look for Lucene alternatives in the Rust ecosystem that can be compiled to wasm.

@hanabi1224 No worries - thanks for the reply! In my case I'm embedding text with either a local model (e.g. using web-ai) or OpenAI's embedding API (the user must enter their OpenAI API key), so I do indeed need something like RuAnnoy rather than keyword search. This isn't for the simple user "message/chat search" functionality - it's for long-term character memory/retrieval - i.e. so a character can try to recall memories that are related to what it's currently talking about.
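
To make the use case concrete, here is a rough sketch of the kind of client-side memory retrieval I mean. It's an exact brute-force cosine-similarity search, not an approximate index like RuAnnoy or hnswlib, so it only illustrates the add/search shape I'd need. `MemoryStore`, `Memory` and `cosineSimilarity` are hypothetical names made up for this example, and the 3-dimensional vectors are toy stand-ins for real embeddings from a local model or OpenAI's embedding API.

```typescript
// Hypothetical in-memory store for illustration only; not part of
// RuAnnoy, Voy, or hnswlib. Search is exact (brute force), not approximate.
type Memory = { text: string; embedding: number[] };

// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

class MemoryStore {
  private memories: Memory[] = [];

  // "Index creation" here is just appending, which is why updates are trivial.
  add(text: string, embedding: number[]): void {
    this.memories.push({ text, embedding });
  }

  // Return the k memories most similar to the query embedding.
  search(query: number[], k: number): { text: string; score: number }[] {
    return this.memories
      .map((m) => ({ text: m.text, score: cosineSimilarity(query, m.embedding) }))
      .sort((x, y) => y.score - x.score)
      .slice(0, k);
  }
}

// Toy usage: in practice the vectors would come from an embedding model.
const store = new MemoryStore();
store.add("We talked about hiking in the Alps", [1, 0, 0]);
store.add("The user likes jazz", [0, 1, 0]);
store.add("We planned a mountain trip", [0.9, 0.1, 0]);
const results = store.search([1, 0, 0], 2);
console.log(results.map((r) => r.text));
```

Something like this scans every stored vector on each query, so it's only workable for small numbers of memories. That's the reason I'm after a proper ANN index that can be built and updated on the client.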

Looks like we may get a wasm port of hnswlib soon so that might be the best option for now.