LeapLabTHU/MLLA

Definitional Problems with Forget Gates

lazywu170101 opened this issue · 0 comments

Hello author, I am honoured to read your paper, which is clear and concise in its experimental logic. However, while reading it I had a hard time identifying which component is the forget gate. To resolve this, I looked for the relevant descriptions in the Mamba paper and the Vim paper, but unfortunately did not find them. Could you explain exactly how the forget gate is set up?