qixianbiao/RNN

backward seems incorrect

fmscole opened this issue · 1 comment

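The backward pass in the code seems incorrect. The main problem is that the gradient of "h" at a later step (a later layer of the unrolled network) is not passed back to the step before it.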
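To make the missing term explicit, here is a sketch of the recurrence, assuming the forward pass is $h_t = \tanh(W_{xh} x_t + B_{xh} + W_{hh} h_{t-1} + B_{hh})$ with a softmax output $\hat{y}_t$ and a cross-entropy loss $L$ (the forward pass is not shown in this issue, so this is an assumption; $\delta_t$ is notation introduced here for $\partial L / \partial h_t$):

```latex
\delta_t \;=\; \frac{\partial L}{\partial h_t}
\;=\; W_{yh}^{\top}\,(\hat{y}_t - y_t)
\;+\; W_{hh}^{\top}\,\bigl[(1 - h_{t+1} \odot h_{t+1}) \odot \delta_{t+1}\bigr],
\qquad t = T-2, \dots, 0,
\qquad
\delta_{T-1} \;=\; W_{yh}^{\top}\,(\hat{y}_{T-1} - y_{T-1}).
```

The second term is the gradient arriving from step $t+1$. Dropping it leaves only the contribution of the output at step $t$, which is the problem described above.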
The modified code is:

def backword(self, x, y, h, output, lr=0.002):
        # x and y hold one column per time step; h and output are indexed by t.
        T = x.shape[1]
        # Gradient accumulators, one per parameter.
        D_Wyh = np.zeros_like(self.Wyh)
        D_Byh = np.zeros_like(self.Byh)
        D_Whh = np.zeros_like(self.Whh)
        D_Bhh = np.zeros_like(self.Bhh)
        D_Wxh = np.zeros_like(self.Wxh)
        D_Bxh = np.zeros_like(self.Bxh)
        
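        # Walk backward through time so that each step receives the
        # hidden-state gradient from the step after it.
        for t in range(T-1, -1, -1):
            # Output error at step t (prediction minus target).
            dQ = output[t] - y[:, t]
            if t == T-1:
                # Last step: h[t] only feeds the output at this step.
                dL_ht = np.dot(np.transpose(self.Wyh), dQ)
            else:
                # Earlier steps: add the output path to the gradient carried
                # back from step t+1 (set at the end of the previous iteration).
                dL_ht += np.dot(np.transpose(self.Wyh), dQ)

            D_Wyh += np.outer(dQ, h[t])
            D_Byh += dQ

            # tanh derivative, applied to the total gradient reaching h[t].
            dh = (1 - h[t]*h[t])
            dt = dh*dL_ht
            D_Wxh += np.outer(dt, x[:, t])
            D_Bxh += dt

            # At t == 0 this reads h[-1]; that is only correct if h stores
            # the initial hidden state under index -1.
            D_Whh += np.outer(dt, h[t-1])
            D_Bhh += dt

            # Gradient passed back to h[t-1] through Whh.
            dL_ht = np.dot(np.transpose(self.Whh), dt)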
             
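        # Clip every parameter gradient to [-5, 5] in place.
        for dparam in [D_Wyh, D_Byh, D_Wxh, D_Bxh, D_Whh, D_Bhh]:
            np.clip(dparam, -5, 5, out=dparam)

        # Plain gradient-descent update.
        self.Wyh -= lr*D_Wyh
        self.Wxh -= lr*D_Wxh
        self.Whh -= lr*D_Whh
        self.Byh -= lr*D_Byh
        self.Bhh -= lr*D_Bhh
        self.Bxh -= lr*D_Bxh
        # After the loop, dL_ht holds the gradient flowing into h[-1];
        # it is applied to self.h without clipping.
        self.h -= lr*dL_ht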
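One way to check a backward pass like this is a finite-difference gradient check. The sketch below is standalone and does not use the repository's class: the forward pass (tanh hidden state, softmax output, cross-entropy loss), the dictionary `p` of parameters, and the helper names `forward`, `grads` and `max_abs_error` are all assumptions made for illustration. The `propagate=False` variant is a hypothetical stand-in for a backward pass that does not carry the hidden-state gradient back, not a copy of the original code.

```python
import numpy as np

def forward(p, x, y, h0):
    # Assumed forward pass: tanh hidden state, softmax output, cross-entropy loss.
    h, out, loss = {-1: h0}, {}, 0.0
    for t in range(x.shape[1]):
        h[t] = np.tanh(p["Wxh"] @ x[:, t] + p["Bxh"] + p["Whh"] @ h[t - 1] + p["Bhh"])
        z = p["Wyh"] @ h[t] + p["Byh"]
        e = np.exp(z - z.max())
        out[t] = e / e.sum()
        loss -= np.log(out[t] @ y[:, t])  # y[:, t] is a one-hot target
    return h, out, loss

def grads(p, x, y, h, out, propagate=True):
    # Same recurrence as the corrected method; propagate=False drops the
    # gradient carried back from step t+1 (hypothetical "buggy" variant).
    g = {k: np.zeros_like(v) for k, v in p.items()}
    carried = np.zeros_like(h[-1])
    for t in range(x.shape[1] - 1, -1, -1):
        dQ = out[t] - y[:, t]
        dL_ht = p["Wyh"].T @ dQ + carried
        dt = (1 - h[t] * h[t]) * dL_ht
        g["Wyh"] += np.outer(dQ, h[t])
        g["Byh"] += dQ
        g["Wxh"] += np.outer(dt, x[:, t])
        g["Bxh"] += dt
        g["Whh"] += np.outer(dt, h[t - 1])
        g["Bhh"] += dt
        if propagate:
            carried = p["Whh"].T @ dt
    return g

def max_abs_error(p, x, y, h0, propagate, eps=1e-5):
    # Largest gap between analytic and central-difference gradients.
    h, out, _ = forward(p, x, y, h0)
    g = grads(p, x, y, h, out, propagate)
    worst = 0.0
    for name, W in p.items():
        for idx in np.ndindex(W.shape):
            old = W[idx]
            W[idx] = old + eps
            loss_plus = forward(p, x, y, h0)[2]
            W[idx] = old - eps
            loss_minus = forward(p, x, y, h0)[2]
            W[idx] = old  # restore the parameter
            numeric = (loss_plus - loss_minus) / (2 * eps)
            worst = max(worst, abs(numeric - g[name][idx]))
    return worst

rng = np.random.default_rng(0)
D, H, K, T = 4, 5, 3, 6  # input size, hidden size, classes, time steps
p = {
    "Wxh": rng.normal(scale=0.5, size=(H, D)), "Bxh": np.zeros(H),
    "Whh": rng.normal(scale=0.5, size=(H, H)), "Bhh": np.zeros(H),
    "Wyh": rng.normal(scale=0.5, size=(K, H)), "Byh": np.zeros(K),
}
x = rng.normal(size=(D, T))
y = np.eye(K)[:, rng.integers(0, K, size=T)]  # one-hot column per step
h0 = np.zeros(H)

err_ok = max_abs_error(p, x, y, h0, propagate=True)
err_bad = max_abs_error(p, x, y, h0, propagate=False)
print("with propagation:", err_ok)
print("without propagation:", err_bad)
```

Under these assumptions, the error with propagation should be close to floating-point noise, while the variant without it should disagree with the numerical gradient by a much larger margin.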

A single block of code got split into two pieces; it seems GitHub has bugs too.