DwangoMediaVillage/pqkmeans

Sparse Vector


Just wondering whether pqkmeans can be optimized for sparse vectors (e.g., CSR format) instead of dense ones.

Thank you.

No. You need to convert sparse vectors to dense ones before applying pqkmeans.

My general advice is to project the sparse data to a lower-dimensional (denser) space, e.g., from D=10K to D=100. That way, PQ will work better.

You can try multiplying by a random matrix. Multiplying a matrix by a sparse vector is generally fast, since the cost scales with the number of nonzero entries rather than with D.
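
As a rough illustration of the random-projection idea, here is a minimal sketch using NumPy and SciPy. The helper name `random_project`, the Gaussian choice for the random matrix, the 1/sqrt(out_dim) scaling, and the synthetic data sizes are all assumptions made for the example; none of this is part of pqkmeans itself.

```python
import numpy as np
from scipy import sparse


def random_project(X_sparse, out_dim=100, seed=0):
    """Project a sparse (N, D) matrix to a dense (N, out_dim) float32 array.

    Hypothetical helper for illustration; not part of pqkmeans.
    """
    rng = np.random.default_rng(seed)
    in_dim = X_sparse.shape[1]
    # Dense Gaussian random matrix. The 1/sqrt(out_dim) scaling keeps
    # squared norms roughly unchanged in expectation (an assumed choice).
    R = rng.standard_normal((in_dim, out_dim)) / np.sqrt(out_dim)
    # sparse @ dense yields a dense array; only nonzero entries of
    # X_sparse contribute to the work done.
    return np.asarray(X_sparse @ R, dtype=np.float32)


# Synthetic CSR data: 1000 vectors, D=10K, about 0.1% nonzeros.
X = sparse.random(1000, 10000, density=0.001, format="csr",
                  random_state=0, dtype=np.float32)
X_dense = random_project(X, out_dim=100)
print(X_dense.shape, X_dense.dtype)
```

The resulting dense array can then be used wherever pqkmeans expects dense input. Note that `out_dim` would need to be compatible with the number of subspaces you choose for PQ.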