google/deluca

Implementation of DRC

FarnazAdib opened this issue · 4 comments

Hi

Thanks for providing this interesting package.

I am trying to test DRC on a simple setup, and I have noticed that the current implementation of DRC does not work. When I run it on a simple partially observable linear system with

A = np.array([[1.0, 0.95], [0.0, -0.9]]),
B = np.array([[0.0], [1.0]]),
C = np.array([[1.0, 0.0]]),
Q = R = I,

Gaussian process noise, and zero observation noise, which is open-loop stable, the controller acts like a zero controller. I tried to get a different response by tuning the hyperparameters, but the results are mostly the same.
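For concreteness, here is a minimal NumPy sketch of the setup I am testing. The rollout below applies the zero controller, which is effectively what the learned DRC policy produces; the simulation loop, seed, and cost bookkeeping are my own, only the system matrices come from above:

```python
import numpy as np

# Partially observable LDS from the description above.
A = np.array([[1.0, 0.95], [0.0, -0.9]])  # upper triangular: eigenvalues 1.0 and -0.9
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
Q, R = np.eye(2), np.eye(1)

rng = np.random.default_rng(0)
x = np.zeros((2, 1))
total_cost = 0.0
for t in range(100):
    u = np.zeros((1, 1))          # zero controller: what the learned DRC policy produces here
    w = rng.normal(size=(2, 1))   # Gaussian process noise
    x = A @ x + B @ u + w
    y = C @ x                     # zero observation noise
    total_cost += float(x.T @ Q @ x + u.T @ R @ u)  # assuming Q penalizes the state
```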
Then I looked at the implementation on the deluca GitHub, and I noticed that the counterfactual cost is not defined correctly (if I am not wrong). According to Algorithm 1 in [1], we need to use M_t to compute y_t, which depends on the previous controls (u) recomputed with the same M_t; in the implementation, however, the previously applied controls, which were produced by M_{t-i}, are used instead. In any case, I implemented the algorithm using M_t, but what I get after the simulation is either a control close to zero or an unstable one.
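To make the distinction concrete, here is a rough JAX sketch of the counterfactual loss as I read Algorithm 1 of [1]. The function name, the shapes, and the assumption that the Markov operator G is known are all my own, so this is only an illustration of where M_t enters, not deluca's actual API:

```python
import jax.numpy as jnp

def counterfactual_cost(M, G, y_nat_hist, cost_fn):
    """Counterfactual loss as I read Algorithm 1 of [1].

    M          : (m, d_u, d_y)  current DRC parameters M_t
    G          : (h, d_y, d_u)  Markov operator, G[j] ~ C A^j B  (assumed known)
    y_nat_hist : (T, d_y)       natural observations, newest last, T >= h + m
    cost_fn    : callable       c(y, u) -> scalar
    """
    m, h = M.shape[0], G.shape[0]
    t = y_nat_hist.shape[0] - 1

    def u_of(s):
        # Counterfactual control at step s: every lag uses the *current* M_t.
        return sum(M[i] @ y_nat_hist[s - i] for i in range(m))

    # Counterfactual observation: y^nat_t plus the effect of the
    # counterfactual controls through the Markov operator.
    y_t = y_nat_hist[t] + sum(G[j] @ u_of(t - 1 - j) for j in range(h))
    return cost_fn(y_t, u_of(t))
```

The key point is that u_of recomputes every lagged control with the current M_t, whereas the implementation substitutes the controls that were actually played under M_{t-i}.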

I was wondering if you have a working code example of the DRC algorithm?
[1] Simchowitz, M., Singh, K., and Hazan, E., "Improper learning for non-stochastic control," COLT 2020.

Thanks a lot,
Sincerely,
Farnaz

Hi Daniel,

Thank you very much for your response.

In your paper "Deluca -- A Differentiable Control Library: Environments, Methods, and Benchmarking," it is mentioned that DRC is an implementation of [1]. In any case, you said it is a "sample implementation" of DRC, so you have presumably tried your implementation on something. I was wondering if I could have that example?

Thank you very much for your time and help!
Best regards,
Farnaz

OK. I will not follow up on this point anymore.