rll-research/url_benchmark

Is the order of the args in lstsq in the APS agent's regress_meta function correct?

raymondchua opened this issue · 0 comments

I am curious why, in the regress_meta function of the APS agent, reward is the first argument to lstsq rather than rep. According to the torch documentation, torch.linalg.lstsq(A, B) finds the X that minimizes ||AX - B||_F, where A is the first argument and B is the second. Since we are trying to find w such that ||rep * w - reward|| is minimized, shouldn't A be rep instead?

task = torch.linalg.lstsq(reward, rep)[0][:rep.size(1), :][0]
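To make the question concrete, here is a minimal sketch comparing the two argument orders on synthetic data. The shapes (64 samples, a 10-dimensional rep, a single reward column) and the names n, d, true_w, w and x_swapped are assumptions for illustration only; they are not taken from the repository.

```python
import torch

torch.manual_seed(0)

# Hypothetical shapes: n samples, d-dimensional representation.
n, d = 64, 10
rep = torch.randn(n, d)      # plays the role of A in ||A X - B||_F
true_w = torch.randn(d, 1)   # made-up ground-truth task vector
reward = rep @ true_w        # noiseless rewards, shape (n, 1)

# Documented order: lstsq(A, B) minimizes ||A X - B||_F.
# With A = rep and B = reward, X has shape (d, 1) and should match true_w.
w = torch.linalg.lstsq(rep, reward).solution

# Order used in the line above: A = reward, B = rep.
# This minimizes ||reward @ X - rep||_F, so X has shape (1, d).
x_swapped = torch.linalg.lstsq(reward, rep).solution

print("w shape:", tuple(w.shape))
print("max |w - true_w|:", (w - true_w).abs().max().item())
print("x_swapped shape:", tuple(x_swapped.shape))
print("max |x_swapped[0] - true_w|:", (x_swapped[0] - true_w[:, 0]).abs().max().item())
```

In this sketch the lstsq(rep, reward) call recovers true_w up to floating-point error. The swapped call still returns a vector of length d after taking row 0, so it runs without a shape error, but it is the solution to a different problem: regressing each column of rep on reward.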