messense/jieba-rs

Reduce the BinaryHeap size to be `k`

MnO2 opened this issue · 1 comment

MnO2 commented

To retrieve the top k words, the BinaryHeap only needs to hold k nodes, not all n.

We should rewrite the following snippet to use a min-heap, so that the heap never grows beyond k entries.

        let mut heap = BinaryHeap::new();
        for (k, v) in ranking_vector.iter().enumerate() {
            heap.push(HeapNode {
                rank: (v * 1e10) as u64,
                word_id: k,
            })
        }

        let mut res: Vec<String> = Vec::new();
        for _ in 0..top_k {
            if let Some(w) = heap.pop() {
                res.push(unique_words[w.word_id].clone());
            }
        }
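A rough sketch of what the bounded version could look like, not necessarily what the actual fix does. The function name `top_k_indices` is hypothetical, and a `(rank, word_id)` tuple wrapped in `std::cmp::Reverse` stands in for `HeapNode`, whose `Ord` implementation is not shown here, so tie-breaking between equal ranks may differ from the current code. Each score is pushed and the smallest entry is popped whenever the heap exceeds `top_k`, which gives O(n log k) time and O(k) memory instead of O(n) memory.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Hypothetical helper: returns the indices of the `top_k` highest scores,
/// best first, while keeping at most `top_k + 1` entries in the heap.
fn top_k_indices(ranking_vector: &[f64], top_k: usize) -> Vec<usize> {
    // `BinaryHeap` is a max-heap; wrapping entries in `Reverse` makes
    // `pop()` return the entry with the smallest (rank, word_id) instead.
    let mut heap: BinaryHeap<Reverse<(u64, usize)>> = BinaryHeap::new();

    for (word_id, v) in ranking_vector.iter().enumerate() {
        // Same fixed-point scaling as the snippet above.
        let rank = (v * 1e10) as u64;
        heap.push(Reverse((rank, word_id)));
        if heap.len() > top_k {
            // Evict the current minimum so only the k largest remain.
            heap.pop();
        }
    }

    // Ascending order of `Reverse<T>` is descending order of `T`,
    // so the highest-ranked entry comes first.
    heap.into_sorted_vec()
        .into_iter()
        .map(|Reverse((_, word_id))| word_id)
        .collect()
}
```

The result could then replace the second loop with something like `top_k_indices(&ranking_vector, top_k).into_iter().map(|id| unique_words[id].clone()).collect::<Vec<String>>()`.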

Fixed in #28