deepseek-ai/DeepSeek-V2

Error in Equation 16?

zhongmz opened this issue · 1 comments

It appears that the current formulation is ?

$$
q_{t,i} = [q^{C}_{t,i}; q_{t}^{R}],
$$

The formula is correct. q^R is multi-head, so each head has its own q^R; only k^R is shared across heads.
You can refer to the illustration of DeepSeek-V2 for an intuitive understanding.
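
To make the distinction concrete, here is a minimal sketch of the per-head concatenation for a single token. It is not taken from the DeepSeek-V2 code: the sizes `n_heads`, `d_c`, `d_r` and the helper `rand_vec` are made up for illustration, and random vectors stand in for the real projections and RoPE. It only shows that each query head carries its own q^R part, while every key head reuses the same k^R.

```python
import random

# Illustrative sizes only (assumptions, not the model's real configuration).
n_heads = 4  # number of attention heads
d_c = 8      # per-head size of the non-rotary ("C") part
d_r = 2      # per-head size of the rotary ("R") part

rng = random.Random(0)


def rand_vec(n):
    """Hypothetical stand-in for a projected (and RoPE-rotated) vector."""
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


# One token t.
q_c = [rand_vec(d_c) for _ in range(n_heads)]  # q^C_{t,i}: one vector per head
q_r = [rand_vec(d_r) for _ in range(n_heads)]  # q^R_{t,i}: one vector per head
k_c = [rand_vec(d_c) for _ in range(n_heads)]  # k^C_{t,i}: one vector per head
k_r = rand_vec(d_r)                            # k^R_t: a single vector for all heads

# Per-head concatenation: the query uses its own rotary part, the key reuses k_r.
q = [q_c[i] + q_r[i] for i in range(n_heads)]
k = [k_c[i] + k_r for i in range(n_heads)]

# The rotary tail of every key head is the same shared vector ...
shared_k_tail = all(k[i][d_c:] == k_r for i in range(n_heads))
# ... while the rotary tail of each query head is different.
distinct_q_tail = len({tuple(q[i][d_c:]) for i in range(n_heads)}) == n_heads

print(shared_k_tail, distinct_q_tail)
```

In this sketch the key-side rotary vector is stored once and reused by every head, whereas the query side keeps one rotary vector per head, which is why the query formula carries the head index on its R part.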