Is the order of the args in lstsq in the APS agent's regress meta function correct?
raymondchua opened this issue · 0 comments
raymondchua commented
I am curious why, in the regress_meta function of the APS agent, reward is the first argument to lstsq rather than rep. According to the torch documentation, torch.linalg.lstsq(A, B) finds the X that minimizes ||AX - B||_F, with A being the first argument and B the second. Since the problem we are trying to solve is finding w such that ||rep * w - reward|| is minimized, shouldn't A be rep instead?
task = torch.linalg.lstsq(reward, rep)[0][:rep.size(1), :][0]
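To make the question concrete, here is a minimal sketch comparing the two argument orders on synthetic data. It assumes rep has shape (n, d) and reward has shape (n, 1), which the snippet above does not confirm. The names n, d and w_true are hypothetical and exist only for this illustration.

```python
import torch

torch.manual_seed(0)

# Hypothetical sizes: n samples, d-dimensional representation (assumed shapes).
n, d = 64, 5
rep = torch.randn(n, d, dtype=torch.float64)
w_true = torch.randn(d, 1, dtype=torch.float64)
reward = rep @ w_true  # noiseless rewards, so an exact w exists

# Order from the torch docs: A = rep, B = reward, minimizing ||rep @ w - reward||.
w_rep_first = torch.linalg.lstsq(rep, reward).solution  # shape (d, 1)

# Order used in the line above: A = reward, B = rep,
# which minimizes ||reward @ X - rep|| over X of shape (1, d).
x_reward_first = torch.linalg.lstsq(reward, rep).solution  # shape (1, d)
task_as_written = x_reward_first[: rep.size(1), :][0]      # shape (d,)

rep_first_recovers_w = torch.allclose(w_rep_first, w_true, atol=1e-8)
reward_first_recovers_w = torch.allclose(
    task_as_written, w_true.squeeze(1), atol=1e-8
)

print("rep first recovers w_true:   ", rep_first_recovers_w)
print("reward first recovers w_true:", reward_first_recovers_w)
```

Under these assumptions, both orders return a result with d entries, so the slicing in the original line does not fail. However, the reward-first call solves a different least-squares problem: its solution is rep^T reward scaled by 1 / ||reward||^2, which generally differs from the w that minimizes ||rep * w - reward||.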