Slight error in ADAGrad, momentum should be momentum(i)

Question

Slight error in ADAGrad, momentum should be momentum(i)

Opened this issue 7 years ago · 0 comments

Currently on the master branch, in the ADAGrad code (link to code), in lines 145 and 148, we have:

 ADAGrad.ADAGradm(gmm, gum, gss, momentum.asInstanceOf[GMat], mu, mask.asInstanceOf[GMat], nw.dv.toFloat, gve, gts, glrate, opts.langevin, opts.epsilon, (opts.waitsteps < nsteps));

and

 ADAGrad.ADAGradn(gmm, gum, gss, momentum.asInstanceOf[GMat], mu, mask.asInstanceOf[GMat], nw.dv.toFloat, gve, gts, glrate, opts.langevin, opts.epsilon, (opts.waitsteps < nsteps));

Here, momentum is being cast to a single GMat. However, momentum was defined as an array of GMats at the start of the code. Running ADAGrad with momentum will result in an error since an array can't be cast to a matrix. The fix is to change momentum to momentum(i) for both of these cases, which makes sense since this is in a loop around all the model matrices, and also, the CPU version uses momentum(i).

When I changed it to momentum(i), the GPU version of ADAGrad works correctly.