questions on beam_search method
gitfourteen opened this issue · 4 comments
Can someone explain why `/` is used here? We cannot guarantee that `prev_hyp_ids` will be an integer. Why not use `//` instead? Is this a typo?
prev_hyp_ids = top_cand_hyp_pos / len(self.vocab.tgt)
https://github.com/pcyin/pytorch_basic_nmt/blob/master/nmt.py#L340
And let me update my understanding of `prev_hyp_ids`: it is a list of at most `beam_size` indices that record which hypothesis each of the top-k scores belongs to, at every beam search step. The items of `prev_hyp_ids` may contain duplicates.
For example, set `beam_size = 4`. Suppose that in the previous beam search step we have already collected 1 sentence in `completed_hypotheses`, and
hypotheses = [['<s>', word_1, word_2], ['<s>', word_3, word_4], ['<s>', word_5, word_6]]
We then find the top 2 candidate words following word_2, and the third-highest candidate word following word_6, so we have
prev_hyp_ids = [0, 0, 2]
How do we get this result? Further assume these 3 words are the 100th, 1,000th, and 9,000th words in the vocabulary, and that the vocabulary size is 50,000. The first two candidates extend hypothesis 0, so their offset is 0 * 50000; the third extends hypothesis 2, so its offset is 2 * 50000:
top_cand_hyp_pos = [100, 1000, 9000 + 2*50000]
prev_hyp_ids = [100, 1000, 9000 + 2*50000] // 50000 = [0, 0, 2]
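As a sanity check, here is a minimal PyTorch sketch mirroring the numbers above (the companion modulo result, named `hyp_word_ids` in the repo if I read it correctly, recovers the word indices):

```python
import torch

vocab_size = 50000
# Flat indices of the three chosen candidates, matching the example above
top_cand_hyp_pos = torch.tensor([100, 1000, 9000 + 2 * 50000])

prev_hyp_ids = top_cand_hyp_pos // vocab_size  # tensor([0, 0, 2]): which hypothesis to extend
hyp_word_ids = top_cand_hyp_pos % vocab_size   # tensor([100, 1000, 9000]): which word extends it
```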
And if I am right, the beam search here simply sorts the candidate sentences by their summed token scores to make the final choice; it does not normalize the score by sentence length. Is this just for teaching purposes, so that students can optimize based on it?
completed_hypotheses.sort(key=lambda hyp: hyp.score, reverse=True)
https://github.com/pcyin/pytorch_basic_nmt/blob/master/nmt.py#L376
top_hypotheses = [hyps[0] for hyps in hypotheses]
https://github.com/pcyin/pytorch_basic_nmt/blob/master/nmt.py#L755
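If length bias is the concern, one common fix is to rank by average per-token log-probability instead of the raw sum. A minimal sketch, assuming `Hypothesis` is the namedtuple nmt.py defines, with `value` holding the token list and `score` the summed log-probability:

```python
from collections import namedtuple

# Assumed structure, as in nmt.py: `value` is the token list, `score` the summed log-prob
Hypothesis = namedtuple('Hypothesis', ['value', 'score'])

completed_hypotheses = [
    Hypothesis(value=['<s>', 'a', 'b', '</s>'], score=-4.0),
    Hypothesis(value=['<s>', 'a', '</s>'], score=-3.5),
]

# Length-normalized ranking: average log-prob per token instead of the raw sum.
# Here the 4-token hypothesis wins (-1.0 vs. about -1.17 per token),
# even though its raw summed score is lower.
completed_hypotheses.sort(key=lambda hyp: hyp.score / len(hyp.value), reverse=True)
```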
> Can someone explain why `/` is used here? We cannot guarantee that `prev_hyp_ids` will be an integer. Why not use `//` instead? Is this a typo?
>
> `prev_hyp_ids = top_cand_hyp_pos / len(self.vocab.tgt)`
I found the answer to the first question here:
https://github.com/pytorch/pytorch/issues/5411
PyTorch's definition of integer division differs from NumPy's and Python 3's. Anyway, in recent PyTorch (`torch.__version__ == '1.1.0'`), floor division `//` can replace `/` in the beam_search method.
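For anyone hitting this on a newer PyTorch, a minimal sketch of the options (the version cutoffs below are approximate and worth double-checking):

```python
import torch

top_cand_hyp_pos = torch.tensor([100, 1000, 109000])  # LongTensor of flat indices
vocab_size = 50000

# In old PyTorch (roughly <= 1.4), `/` on integer tensors performed integer
# division, so the original line happened to work. Later releases treat `/`
# as true division and return floats, so use floor division explicitly:
prev_hyp_ids = top_cand_hyp_pos // vocab_size  # tensor([0, 0, 2])

# Equivalent, with the rounding spelled out (available in PyTorch >= 1.8):
prev_hyp_ids = torch.div(top_cand_hyp_pos, vocab_size, rounding_mode='floor')
```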
Hi, the type of `top_cand_hyp_pos` should be `torch.LongTensor`, and so should `prev_hyp_ids`.
Sorry for my late reply! For your second question, note that the values of `top_cand_hyp_pos` are indices into the flattened score tensor, whose size is `number_of_hypotheses * vocab_size`. Therefore, by dividing `top_cand_hyp_pos` by the vocab size, we get the id of the hypothesis each top-ranked token belongs to.
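To make this concrete, a rough sketch of where `top_cand_hyp_pos` comes from (simplified from nmt.py; the random scores below stand in for the model's actual continuation log-probabilities):

```python
import torch

num_hyps, vocab_size, live_beam = 3, 50000, 3
# One row of continuation scores per live hypothesis, one column per target word
cont_scores = torch.randn(num_hyps, vocab_size)

# A single topk over the flattened tensor ranks all (hypothesis, word) pairs jointly
top_cand_hyp_scores, top_cand_hyp_pos = torch.topk(cont_scores.view(-1), k=live_beam)

# Row index = which hypothesis to extend; column index = which word extends it
prev_hyp_ids = top_cand_hyp_pos // vocab_size
hyp_word_ids = top_cand_hyp_pos % vocab_size
```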