How to align and get the begin and end time of the query in database？

Question

How to align and get the begin and end time of the query in database？

980202006 opened this issue 3 years ago · 4 comments

980202006 commented 3 years ago

If I query the start and end positions of an audio, through faiss.search, it is possible that top n is a non-continuous value, how can I align it?

Answer 1 · 2021-09-06T11:37:22.000Z

The simplest answer is matcher.

Answer 2 · 2021-09-06T14:56:47.000Z

@980202006 For the details, here is one example.
We assume a Query as a sequence of 3 contiguous segments.

Reference DB: [z0, z1, z2, z3,...,z10, z11, z12,...z20]
Query: [z1', z2', z3']

The top N (e.g. N=5) results of faiss.search for the query, for example:

I(z1') = [1, 2, 11, 3, 15] # sorted top-5 result indices
D(z1') = [0.1, 0.2, 0.3, 0.4, 0.5] # distances are not required in this stage.

I(z2') = [11, 2, 1, 3, 12]
D(z2') = ...
 
I(z3') = [3, 2, 12, 4, 11]
D(z3') = ...

Here the result [11,2,1,3,12] of the second segment's result I(z2') implies that the sequence would start from '[10,1,0,2,11]. Based on this idea, we perform offset-compensation` for each segment-wise search result.

# time-offset compensation
I'(z1) = I(z1)  # [1, 2, 11, 3, 15]
I'(z2') = I(z2') - 1 # [10, 1, 0, 2, 11]
I'(z3') = I(z3') - 2 # [1, 0, 10, 2, 9]

We then perform np.unique to reduce duplicated candidates

# unique
c_start_index = unique_set(all_I's) # [0, 1, 2, 3, 9, 10, 11, 15]
# from the `c_start_index`, we get the final candidates, c
c = [[0,1,2], [1,2,3], [2,3,4], [9,10,11], [10,11,12], [11,12,13], [15,16,17]]

We calculate the total distance (called score, but actually a penalty: smaller value is better) for each candidate sequence:

# d is distance
score(c[0]) = score([0,1,2]) = d(z1', z0) +d(z2', z1) + d(z3',z2)
score(c[1]) = score([1,2,3]) = d(z1', z1) + d(z2', z2) + d(z3', z3)
...
score(c[7]) = score([15,16,17]) = ...

Finally we take argmin(score).

In the paper, I described this using argmax of cosine-similarity instead of argmin of euclidean-distance. But in the end they are the same thing.

Answer 3 · 2023-07-31T16:47:37.000Z

This is a great example, I want to ask regarding such cases, in order to evaluate, gt_ids = test_ids + dummy_db_shape[0] in the original code considers same number of fp for the db and queries, but in this case it is not, then I should find at what time it starts and find the idx for the start point fp to find ground truths right?
Also this code fake_recon_index, index_shape = load_memmap_data( emb_dummy_dir, 'dummy_db', append_extra_length=query_shape[0], display=False) fake_recon_index[dummy_db_shape[0]:dummy_db_shape[0] + query_shape[0], :] = db[:, :]
only works if query_shape[0] = db_shape[0] otherwise I should use db_shape[0] am i right?

Answer 4 · 2023-08-02T23:47:55.000Z

@mhmd-mst

If what I remember is correct, gt_ids = test_ids + dummy_db_shape[0] was required because our DB was always constructed as dummy_db first, and test_db follows the dummy_db indices.
If you set dumm_db as empty (using np.empty()), then simplygt_ids = test_ids.
If you don't use dummy_db, again, it will be simply fake_recon_index = db. The reason why I use the on-disk solution of memmap variables like fake_recon_index is to save memory. Imagine if dummy_db was your actual DB with size of 2TB!