Crash using cosine similarity when calling search
Opened this issue · 4 comments
When indexing a small number of vectors, I get this panic when I specify cosine_similarity (euclidean, for instance, works fine):
thread 'hora_test' panicked at 'called `Option::unwrap()` on a `None` value', /Users/sam/.cargo/registry/src/github.com-1ecc6299db9ec823/hora-0.1.1/src/core/neighbor.rs:32:54
stack backtrace:
0: rust_begin_unwind
at /rustc/7737e0b5c4103216d6fd8cf941b7ab9bdbaace7c/library/std/src/panicking.rs:584:5
1: core::panicking::panic_fmt
at /rustc/7737e0b5c4103216d6fd8cf941b7ab9bdbaace7c/library/core/src/panicking.rs:143:14
2: core::panicking::panic
at /rustc/7737e0b5c4103216d6fd8cf941b7ab9bdbaace7c/library/core/src/panicking.rs:48:5
3: core::option::Option<T>::unwrap
at /rustc/7737e0b5c4103216d6fd8cf941b7ab9bdbaace7c/library/core/src/option.rs:752:21
4: <hora::core::neighbor::Neighbor<E,T> as core::cmp::Ord>::cmp
at /Users/sam/.cargo/registry/src/github.com-1ecc6299db9ec823/hora-0.1.1/src/core/neighbor.rs:32:9
5: <hora::core::neighbor::Neighbor<E,T> as core::cmp::PartialOrd>::partial_cmp
at /Users/sam/.cargo/registry/src/github.com-1ecc6299db9ec823/hora-0.1.1/src/core/neighbor.rs:38:14
6: core::cmp::PartialOrd::le
at /rustc/7737e0b5c4103216d6fd8cf941b7ab9bdbaace7c/library/core/src/cmp.rs:1129:19
7: core::cmp::impls::<impl core::cmp::PartialOrd<&B> for &A>::le
at /rustc/7737e0b5c4103216d6fd8cf941b7ab9bdbaace7c/library/core/src/cmp.rs:1505:13
8: alloc::collections::binary_heap::BinaryHeap<T>::sift_up
at /rustc/7737e0b5c4103216d6fd8cf941b7ab9bdbaace7c/library/alloc/src/collections/binary_heap.rs:562:16
9: alloc::collections::binary_heap::BinaryHeap<T>::push
at /rustc/7737e0b5c4103216d6fd8cf941b7ab9bdbaace7c/library/alloc/src/collections/binary_heap.rs:496:18
10: hora::index::hnsw_idx::HNSWIndex<E,T>::search_layer::{{closure}}
at /Users/sam/.cargo/registry/src/github.com-1ecc6299db9ec823/hora-0.1.1/src/index/hnsw_idx.rs:363:25
11: <core::slice::iter::Iter<T> as core::iter::traits::iterator::Iterator>::for_each
at /rustc/7737e0b5c4103216d6fd8cf941b7ab9bdbaace7c/library/core/src/slice/iter/macros.rs:211:21
12: hora::index::hnsw_idx::HNSWIndex<E,T>::search_layer
at /Users/sam/.cargo/registry/src/github.com-1ecc6299db9ec823/hora-0.1.1/src/index/hnsw_idx.rs:353:13
13: hora::index::hnsw_idx::HNSWIndex<E,T>::search_knn
at /Users/sam/.cargo/registry/src/github.com-1ecc6299db9ec823/hora-0.1.1/src/index/hnsw_idx.rs:433:25
14: <hora::index::hnsw_idx::HNSWIndex<E,T> as hora::core::ann_index::ANNIndex<E,T>>::node_search_k
at /Users/sam/.cargo/registry/src/github.com-1ecc6299db9ec823/hora-0.1.1/src/index/hnsw_idx.rs:615:55
15: hora::core::ann_index::ANNIndex::search
at /Users/sam/.cargo/registry/src/github.com-1ecc6299db9ec823/hora-0.1.1/src/core/ann_index.rs:93:9
16: hora_c::hora_test
at ./src/lib.rs:192:13
17: hora_c::hora_test::{{closure}}
at ./src/lib.rs:168:1
18: core::ops::function::FnOnce::call_once
at /rustc/7737e0b5c4103216d6fd8cf941b7ab9bdbaace7c/library/core/src/ops/function.rs:227:5
19: core::ops::function::FnOnce::call_once
at /rustc/7737e0b5c4103216d6fd8cf941b7ab9bdbaace7c/library/core/src/ops/function.rs:227:5
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
failures:
hora_test
Same. Do you have any clue?
I ended up switching to brute-force search for my use case, so I haven't revisited it.
I had the same issue. It's caused by commit fca4516, which negates the output of the dot product, so the cosine distance calculation ends up calling sqrt() on a negative number. I'm not sure why the change was made, but reverting it fixed CosineSimilarity for me, though it may break other things. You can see the change I made here: rangsikitpho@0836f2c
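For anyone wondering how a negative sqrt() input turns into an `Option::unwrap()` panic: sqrt() of a negative float returns NaN, and `partial_cmp` returns `None` whenever either side is NaN. The backtrace shows `Neighbor::cmp` reaching `Option::unwrap`, which is presumably a `partial_cmp(...).unwrap()` on the distance (an assumption, I have not confirmed it against the source). Below is a minimal sketch of that mechanism. The function names are hypothetical and none of this is hora's actual code.

```rust
use std::cmp::Ordering;

/// Hypothetical stand-in for a distance function that finishes with a square root.
/// A negative input models the effect of the negated dot product.
fn distance_from_squared(squared: f32) -> f32 {
    // sqrt() of a negative f32 does not panic, it returns NaN.
    squared.sqrt()
}

/// Float comparison as exposed by `PartialOrd`.
/// Returns `None` when either operand is NaN, so calling `.unwrap()` on it panics.
fn try_compare(a: f32, b: f32) -> Option<Ordering> {
    a.partial_cmp(&b)
}

/// NaN-tolerant alternative: `f32::total_cmp` (Rust 1.62 or later) always
/// returns an `Ordering`, so it cannot panic inside a `BinaryHeap`.
fn nan_safe_compare(a: f32, b: f32) -> Ordering {
    a.total_cmp(&b)
}
```

With these, `try_compare(distance_from_squared(-0.25), 1.0)` is `None`, which is the value the `unwrap()` in the backtrace chokes on. Switching the comparison to something like `nan_safe_compare` would only hide the NaN rather than fix the distance, so reverting the negation is still the real fix.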
Thanks @rangsikitpho!
Removing the negation on lines 28 and 32 fixes this, and the top distance pairs look something like [(0, -0.060707208), (3, -0.26921165), (1, -0.6891982), (2, -0.9331413)].
I just convert with distance.abs() to display the score.
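A small sketch of that conversion, in case it helps. It assumes the results are available as (id, distance) pairs with f32 distances. The helper name is hypothetical and not part of hora.

```rust
/// Map (id, distance) pairs to (id, score) pairs by taking the absolute
/// value of each distance. Order of the input is preserved.
fn to_display_scores(pairs: &[(usize, f32)]) -> Vec<(usize, f32)> {
    pairs.iter().map(|&(id, dist)| (id, dist.abs())).collect()
}
```

For the pairs above, `to_display_scores(&[(0, -0.060707208), (3, -0.26921165)])` yields `[(0, 0.060707208), (3, 0.26921165)]`. This only changes how the values are displayed, not how the index ranks them.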