model ensemble
zcyang opened this issue · 0 comments
Hi,
Is the model ensemble part available? If not, could you at least explain how to do model ensembling?
Here is my current understanding:
Given the sequence x_1, x_2, ..., x_n and two models m_1, m_2: first use the encoder parts of m_1, m_2 to get the states s_1, s_2, then feed the states to the respective decoder parts of m_1, m_2. The first input to the decoder is EOS. The decoders then generate the probabilities for the next word, p_1 and p_2. Average the probabilities to get (p_1 + p_2)/2, pick the top k words according to the averaged probability as candidates, and feed them into the next step. The states s_1, s_2 are both updated after the first step.
Is this correct? I tried implementing it this way, but I don't see any improvement from using several models.
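
To make the procedure concrete, here is a minimal sketch of one ensemble decoding step as I understand it. It is not this repository's code: `make_toy_model`, `ensemble_step`, the toy weights, and the choice of index 0 for EOS are all hypothetical stand-ins, with NumPy used only to illustrate the averaging and top-k selection.

```python
import numpy as np

def make_toy_model(seed, vocab_size=6, hidden=4):
    """Hypothetical stand-in for one trained decoder (random fixed weights)."""
    rng = np.random.default_rng(seed)
    W_emb = rng.normal(size=(vocab_size, hidden))
    W_out = rng.normal(size=(hidden, vocab_size))

    def step_fn(state, token):
        # Update the decoder state from the previous token, then
        # compute a softmax distribution over the next word.
        new_state = np.tanh(state + W_emb[token])
        logits = new_state @ W_out
        exp = np.exp(logits - logits.max())
        return exp / exp.sum(), new_state

    return step_fn

def ensemble_step(step_fns, states, prev_token, k):
    """Run every model one step, average the probabilities, pick the top k."""
    probs_list, new_states = [], []
    for step_fn, state in zip(step_fns, states):
        probs, new_state = step_fn(state, prev_token)
        probs_list.append(probs)
        new_states.append(new_state)  # each model keeps its own state
    avg = np.mean(probs_list, axis=0)  # (p_1 + p_2) / 2 for two models
    top_k = np.argsort(-avg)[:k]       # candidate words, best first
    return avg, top_k, new_states

EOS = 0  # assumed token id for EOS in this toy vocabulary
step_fns = [make_toy_model(1), make_toy_model(2)]  # m_1, m_2
states = [np.zeros(4), np.zeros(4)]                # stand-ins for s_1, s_2
avg, top_k, states = ensemble_step(step_fns, states, EOS, k=3)
```

In a full beam search, each of the k candidates would carry its own copy of both updated states into the next step, so every hypothesis keeps one state per model.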