backward seems incorrect
fmscole opened this issue · 1 comment
fmscole commented
The backward pass in the code seems incorrect. The main problem is that the gradient of the hidden state h at a later timestep is never propagated back through Whh to the earlier timesteps.
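In BPTT terms (a sketch, assuming the usual tanh update $h_t = \tanh(W_{xh} x_t + W_{hh} h_{t-1} + b)$), the hidden-state gradient should satisfy

$$
\frac{\partial L}{\partial h_t} = W_{yh}^{\top}(\mathrm{output}_t - y_t) \;+\; W_{hh}^{\top}\Big[(1 - h_{t+1} \odot h_{t+1}) \odot \frac{\partial L}{\partial h_{t+1}}\Big]
$$

The second term, carried across timesteps through $W_{hh}$, is exactly what the original code drops.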
The corrected code is:
```python
def backword(self, x, y, h, output, lr=0.002):
    T = x.shape[1]
    # gradient accumulators, one per parameter
    D_Wyh = np.zeros_like(self.Wyh)
    D_Byh = np.zeros_like(self.Byh)
    D_Whh = np.zeros_like(self.Whh)
    D_Bhh = np.zeros_like(self.Bhh)
    D_Wxh = np.zeros_like(self.Wxh)
    D_Bxh = np.zeros_like(self.Bxh)
    for t in range(T - 1, -1, -1):
        dQ = output[t] - y[:, t]  # output-layer error at timestep t
        if t == T - 1:
            dL_ht = np.dot(np.transpose(self.Wyh), dQ)
        else:
            # add the local output gradient to the gradient carried
            # back from timestep t+1 (this carry was missing before)
            dL_ht += np.dot(np.transpose(self.Wyh), dQ)
        D_Wyh += np.outer(dQ, h[t])
        D_Byh += dQ
        dh = (1 - h[t] * h[t])  # tanh derivative
        dt = dh * dL_ht
        D_Wxh += np.outer(dt, x[:, t])
        D_Bxh += dt
        D_Whh += np.outer(dt, h[t - 1])  # previous hidden state (h[-1] when t == 0)
        D_Bhh += dt
        # propagate the hidden-state gradient back to timestep t-1
        dL_ht = np.dot(np.transpose(self.Whh), dt)
    # clip the accumulated gradients in place to limit exploding gradients
    for dparam in [D_Wyh, D_Byh, D_Wxh, D_Bxh, D_Whh, D_Bhh]:
        np.clip(dparam, -5, 5, out=dparam)
    # plain SGD parameter update
    self.Wyh -= lr * D_Wyh
    self.Wxh -= lr * D_Wxh
    self.Whh -= lr * D_Whh
    self.Byh -= lr * D_Byh
    self.Bhh -= lr * D_Bhh
    self.Bxh -= lr * D_Bxh
    self.h -= lr * dL_ht  # update the initial hidden state with the gradient that reaches it
```
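A quick way to verify the fix is a numerical gradient check. Below is a self-contained sketch, not the repo's class: the tiny tanh RNN, the squared-error loss, and all the sizes are my own assumptions, but the backward loop carries dL_ht across timesteps in exactly the same way as the code above, and compares the analytic D_Whh against a finite-difference estimate:

```python
import numpy as np

rng = np.random.default_rng(0)
Dx, Dh, Dy, T = 3, 4, 2, 5
Wxh = rng.normal(scale=0.1, size=(Dh, Dx))
Whh = rng.normal(scale=0.1, size=(Dh, Dh))
Wyh = rng.normal(scale=0.1, size=(Dy, Dh))
x = rng.normal(size=(Dx, T))
y = rng.normal(size=(Dy, T))

def loss(Whh):
    # forward pass: h_t = tanh(Wxh x_t + Whh h_{t-1}), squared-error loss
    h = np.zeros(Dh)
    L = 0.0
    for t in range(T):
        h = np.tanh(Wxh @ x[:, t] + Whh @ h)
        L += 0.5 * np.sum((Wyh @ h - y[:, t]) ** 2)
    return L

def grad_Whh(Whh):
    # forward pass, storing hidden states (hs[0] is the initial state)
    hs = [np.zeros(Dh)]
    for t in range(T):
        hs.append(np.tanh(Wxh @ x[:, t] + Whh @ hs[-1]))
    # backward pass, carrying dL/dh_t across timesteps as in the fix
    D_Whh = np.zeros_like(Whh)
    dL_ht = np.zeros(Dh)
    for t in range(T - 1, -1, -1):
        dQ = Wyh @ hs[t + 1] - y[:, t]      # output error (squared-error loss)
        dL_ht = dL_ht + Wyh.T @ dQ          # local gradient + carried gradient
        dt = (1 - hs[t + 1] ** 2) * dL_ht   # through the tanh nonlinearity
        D_Whh += np.outer(dt, hs[t])
        dL_ht = Whh.T @ dt                  # propagate to h_{t-1}
    return D_Whh

# finite-difference check on one entry of Whh
eps = 1e-6
i, j = 1, 2
E = np.zeros_like(Whh)
E[i, j] = eps
num = (loss(Whh + E) - loss(Whh - E)) / (2 * eps)
print(num, grad_Whh(Whh)[i, j])  # the two values should agree closely
```

If the final `dL_ht = Whh.T @ dt` line is removed, the two printed values diverge, which is the missing propagation described above.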
fmscole commented
One line of code got split into two pieces; GitHub has bugs too.