hanabi1224/RuAnnoy

Wasm/JS build?

josephrocca opened this issue · 7 comments

Hey, I was just looking into getting Annoy running in the browser and came across your repo. Wondering if you've considered adding a Wasm build? Not sure how much work it'd be, but the first issue I ran into (as a wasm-pack and Rust newbie) was that, IIUC, wasm-pack doesn't have an "out of the box" virtual filesystem that an index file could be loaded into. But if that hurdle can be passed (e.g. with a new AnnoyIndex::load_from_buffer() function?), then perhaps a Wasm build would be quite easy?

No worries if this isn't at all on the road map - just thought I'd ask to see what you think.

I'm open to extending this to support wasm by loading small indices into memory. PRs are welcome, and maybe I can do that in my spare time.

Hey @josephrocca
Wasm support has been added. I've also added an example site to play with.
deployment: https://annoy-web-demo.vercel.app/
source code: https://github.com/hanabi1224/RuAnnoy/tree/master/example/web

@hanabi1224 Awesome! Just tested it with some random data - searches through 100k 512d vectors in 1.6ms! Was taking on the order of ~1s with my very naive "compare every vector against every other vector" approach 😅
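For context, the naive approach mentioned above might look roughly like the following. This is only a sketch under assumptions: the function names `cosineSimilarity` and `bruteForceNearest` are hypothetical, cosine similarity is assumed as the metric, and none of this is taken from the RuAnnoy code. It scores the query against every stored vector, so the cost grows linearly with the number of vectors.

```javascript
// Hypothetical helper: cosine similarity between two equal-length vectors.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // Treat a zero vector as having no similarity to anything.
  if (normA === 0 || normB === 0) return 0;
  return dot / Math.sqrt(normA * normB);
}

// Hypothetical helper: exhaustive search that scores every stored vector
// against the query and returns the k best matches, most similar first.
function bruteForceNearest(vectors, query, k) {
  const scored = vectors.map((v, id) => ({
    id,
    score: cosineSimilarity(v, query),
  }));
  scored.sort((x, y) => y.score - x.score);
  return scored.slice(0, k);
}
```

An Annoy index avoids this full scan by searching a forest of prebuilt trees, which is presumably where most of the speedup comes from.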

Would you be able to publish it to npm or deno.land/x so it's usable with a simple import statement? E.g. something like:

import init, { load_index, IndexType } from "https://cdn.jsdelivr.net/npm/annoy-js@0.0.1/lib/annoy.js";
await init(); // (i.e. wasm file is loaded from jsdelivr too)
...

Not sure about npm, but with deno.land/x it's as simple as setting up a web hook in your repo and then publishing a tag (no account needed, or anything like that): https://deno.land/add_module

(Aside: Is index-building support on the roadmap?)

searches through 100k 512d vectors in 1.6ms

Glad the performance is OK. I have not benchmarked the wasm version yet. simd128 might further speed it up.
With x64 binaries, the benchmark shows ~2.5-3x the time cost of the C++ implementation without SIMD, and ~1-1.5x the time cost of the C++ implementation when using AVX2.

Would you be able to publish it to npm or deno.land/x

Yeah, but I'm not quite familiar with either npm or deno, especially when the package contains wasm. I will need more time to investigate this; PRs are welcome if u r familiar with it.

P.S. When you run wasm-pack build it creates an npm package locally under pkg/. I guess u can link it locally just like how I built the demo for now.

Is index-building support on the roadmap?

Not really, I don't see too much value in it, since the original Python/C++ implementation is convenient and fast. Also, there's a Spark implementation for building the index in a distributed way.

Looks like nothing needs to be changed - I just forked this repo, and added the annoy.js and annoy_bg.wasm to a new dist/web folder, and then added the web hook per this page, then this works in Deno and the browser (and should work in Node.js once all the HTTP import stuff stabilises):

import init, { load_index, IndexType } from "https://deno.land/x/annoy_js_test@v0.1.5/annoy.js";
await init();

Turns out init has some logic in it that auto-fetches the wasm file (assumed to be located beside the annoy.js file) if nothing is passed in as the first argument - I'm assuming this is autogenerated glue code.
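The "located beside the annoy.js file" behaviour can be illustrated with a small sketch. This is not the actual generated glue code: `resolveWasmUrl` is a hypothetical helper, and the default file name `annoy_bg.wasm` is taken from the file mentioned earlier in this thread. It only shows how a relative file name resolves against the URL of the JS module when no explicit input is given.

```javascript
// Hypothetical helper (not the real glue code): choose the wasm location.
// If the caller passed an explicit input, use it unchanged; otherwise
// resolve the wasm file name relative to the URL of the JS module.
function resolveWasmUrl(moduleUrl, input, wasmFileName = "annoy_bg.wasm") {
  if (input !== undefined) return input;
  // new URL(relative, base) replaces the last path segment of the base.
  return new URL(wasmFileName, moduleUrl).href;
}

// Example with the module URL used above; no explicit input is passed,
// so the wasm file is assumed to sit in the same folder as annoy.js.
const wasmUrl = resolveWasmUrl(
  "https://deno.land/x/annoy_js_test@v0.1.5/annoy.js"
);
console.log(wasmUrl);
```

This would explain why simply placing the two files side by side in one folder was enough for the import to work.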

Not really, I don't see too much value in it

There would certainly be value in it! Having to build the index with another language introduces some significant DevX friction - most JS developers aren't familiar with other languages. Luckily in my case I know a bit of Python and can use Google Colab to avoid the somehow-always-painful environment setup, but it's certainly not a great user experience even in my case, especially if I want to regularly re-build the index (I have a potential use-case on a Node server where I'd ideally like to re-build the index each time I restart the server).

That said, I'm sure it's quite a bit of work to get the index building ported, and you've already created a really useful thing here, so I completely understand if it's not on the roadmap due to the effort required!

Hey @josephrocca
I have published it as an npm package: https://www.npmjs.com/package/@hanabi1224/annoy-rs
I guess this is sufficient for your use case, right? If so, I would rather not maintain another deno package.

Yep! Works perfectly if I just import this URL: https://cdn.jsdelivr.net/npm/@hanabi1224/annoy-rs@0.1.0/annoy.js Thanks a lot!