questions on beam_search method
gitfourteen opened this issue · 4 comments
Can someone explain why `/` is used here? We cannot guarantee that `prev_hyp_ids` will be an integer. Why not use `//` instead? Is this a typo?
prev_hyp_ids = top_cand_hyp_pos / len(self.vocab.tgt)
https://github.com/pcyin/pytorch_basic_nmt/blob/master/nmt.py#L340
And let me update my understanding of `prev_hyp_ids`: it is a list of at most `beam_size` indices that record which hypothesis each of the top-k scores belongs to, at every beam search step. The items of `prev_hyp_ids` may contain duplicates.
For example, set `beam_size = 4`. Suppose that in the previous beam search step we have already collected 1 sentence in `completed_hypotheses`, and
hypotheses = [['<s>', word_1, word_2], ['<s>', word_3, word_4], ['<s>', word_5, word_6]]
We then find the top 2 candidate words following word_2, and the third-highest candidate word following word_6, so we have
prev_hyp_ids = [0, 0, 2]
How do we get this result? Further assume these 3 words are the 100th, 1,000th, and 9,000th words in the vocabulary, and that the vocabulary size is 50,000. The first two candidates extend hypothesis 0, so their offset is 0 * 50000; the third extends hypothesis 2, so its offset is 2 * 50000:
top_cand_hyp_pos = [100, 1000, 9000 + 2*50000]
prev_hyp_ids = [100, 1000, 9000 + 2*50000] // 50000 = [0, 0, 2]
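As a sanity check, here is a minimal PyTorch sketch mirroring the numbers above (the companion modulo result, named `hyp_word_ids` in the repo if I read it correctly, recovers the word indices):

```python
import torch

vocab_size = 50000
# Flat indices of the three chosen candidates, matching the example above
top_cand_hyp_pos = torch.tensor([100, 1000, 9000 + 2 * 50000])

prev_hyp_ids = top_cand_hyp_pos // vocab_size  # tensor([0, 0, 2]): which hypothesis to extend
hyp_word_ids = top_cand_hyp_pos % vocab_size   # tensor([100, 1000, 9000]): which word extends it
```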
And if I am right, the beam search here simply sorts the candidate sentences by their summed token scores to make the final choice; it does not normalize the score by sentence length. Is this just for teaching purposes, so that students can optimize based on it?
completed_hypotheses.sort(key=lambda hyp: hyp.score, reverse=True)
https://github.com/pcyin/pytorch_basic_nmt/blob/master/nmt.py#L376
top_hypotheses = [hyps[0] for hyps in hypotheses]
https://github.com/pcyin/pytorch_basic_nmt/blob/master/nmt.py#L755
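If length bias is the concern, one common fix is to rank by average per-token log-probability instead of the raw sum. A minimal sketch, assuming `Hypothesis` is the namedtuple nmt.py defines, with `value` holding the token list and `score` the summed log-probability:

```python
from collections import namedtuple

# Assumed structure, as in nmt.py: `value` is the token list, `score` the summed log-prob
Hypothesis = namedtuple('Hypothesis', ['value', 'score'])

completed_hypotheses = [
    Hypothesis(value=['<s>', 'a', 'b', '</s>'], score=-4.0),
    Hypothesis(value=['<s>', 'a', '</s>'], score=-3.5),
]

# Length-normalized ranking: average log-prob per token instead of the raw sum.
# Here the 4-token hypothesis wins (-1.0 vs. about -1.17 per token),
# even though its raw summed score is lower.
completed_hypotheses.sort(key=lambda hyp: hyp.score / len(hyp.value), reverse=True)
```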
> Can someone explain why `/` is used here? We cannot guarantee that `prev_hyp_ids` will be an integer. Why not use `//` instead? Is this a typo?
>
> `prev_hyp_ids = top_cand_hyp_pos / len(self.vocab.tgt)`
I found the answer to the first question here:
https://github.com/pytorch/pytorch/issues/5411
PyTorch's definition of integer division differs from NumPy's and Python 3's. Anyway, in recent PyTorch (`torch.__version__ == '1.1.0'`), floor division `//` can replace `/` in the beam_search method.
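For anyone hitting this on a newer PyTorch, a minimal sketch of the options (the version cutoffs below are approximate and worth double-checking):

```python
import torch

top_cand_hyp_pos = torch.tensor([100, 1000, 109000])  # LongTensor of flat indices
vocab_size = 50000

# In old PyTorch (roughly <= 1.4), `/` on integer tensors performed integer
# division, so the original line happened to work. Later releases treat `/`
# as true division and return floats, so use floor division explicitly:
prev_hyp_ids = top_cand_hyp_pos // vocab_size  # tensor([0, 0, 2])

# Equivalent, with the rounding spelled out (available in PyTorch >= 1.8):
prev_hyp_ids = torch.div(top_cand_hyp_pos, vocab_size, rounding_mode='floor')
```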
Hi, the type of `top_cand_hyp_pos` should be `torch.LongTensor`, and so should `prev_hyp_ids`.
Sorry for my late reply! For your second question, note that the values of `top_cand_hyp_pos` are indices into the flattened score tensor, whose size is `number_of_hypotheses * vocab_size`. Therefore, by dividing `top_cand_hyp_pos` by the vocab size, we get the id of the hypothesis each top-ranked token belongs to.
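To make this concrete, a rough sketch of where `top_cand_hyp_pos` comes from (simplified from nmt.py; the random scores below stand in for the model's actual continuation log-probabilities):

```python
import torch

num_hyps, vocab_size, live_beam = 3, 50000, 3
# One row of continuation scores per live hypothesis, one column per target word
cont_scores = torch.randn(num_hyps, vocab_size)

# A single topk over the flattened tensor ranks all (hypothesis, word) pairs jointly
top_cand_hyp_scores, top_cand_hyp_pos = torch.topk(cont_scores.view(-1), k=live_beam)

# Row index = which hypothesis to extend; column index = which word extends it
prev_hyp_ids = top_cand_hyp_pos // vocab_size
hyp_word_ids = top_cand_hyp_pos % vocab_size
```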