Definitional Problems with Forget Gates
lazywu170101 opened this issue · 0 comments
lazywu170101 commented
Hello author, I was honoured to read your paper, whose experimental logic is clear and concise. However, while reading it I had a hard time working out which component is the forget gate. I looked for a relevant description in the Mamba paper and the Vim paper, but unfortunately could not find one. Could you explain exactly how the forget gate is set up?